Choosing a Significance Test. Student Resource Sheet


Choosing Your Test

Choosing an appropriate type of significance test is a very important consideration in analyzing data. If an inappropriate test is used, the analysis will not only be meaningless, but also misleading. In order to determine which type of statistical test is most appropriate for your analysis, consider the following three questions:

1. What type of data are you analyzing?
2. How many variables are being measured?
3. How many groups are involved?

Type of Data

Is your data categorical or quantitative? Sometimes the answer to that question is as easy as asking if the data consists of words (categorical) or numbers (quantitative), but the distinction isn't always so clear. For example, a high school student's grade level could be thought of as categorical (freshman, sophomore, junior, senior), or it could be considered quantitative (9, 10, 11, 12). The distinction depends on how you, the researcher, will use the data. If you are interested in the proportion of the sample falling into each category, treat the data as categorical. If, on the other hand, you are interested in calculating the average grade level of the students in your sample or you want to know how grade level is related to GPA, treat the data as quantitative.

Table 1: Determining the type of data

Categorical
You will analyze how many subjects fall into each category, or what percent fall into each category.
Usually not numeric, but can be numeric.
Examples: political preference, gender, favorite movie genre.

Quantitative
You will analyze averages, or you will analyze the correlation between two variables.
Always numeric.
Examples: SAT scores, height, cell phone minutes used per month.

Number of Variables

How many variables are being measured? You may have collected data comprised of dozens of variables, but when conducting a significance test, you will not necessarily use all variables in a single test. Focus on the specific question the significance test is answering. For example, if you conducted a survey in which you asked each subject their gender, ethnicity, religious affiliation, political preference, age, height, grade level, GPA, SAT scores, favorite movie genre, and number of cell phone minutes used last month, you have collected data comprised of eleven variables. Before conducting a significance test, you need to focus on a specific question, and then determine how many variables are needed to answer that question. As is often the case in statistics, the answer is not always clear-cut.
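As a minimal sketch of narrowing a survey down to the variables one question needs, the Python snippet below pulls just gender and SAT score out of a larger set of records. The records, field names, and values are hypothetical and serve only as an illustration; they are not data from this sheet.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey records (illustrative values only). A real survey
# would carry all eleven variables; four are shown to keep this short.
survey = [
    {"gender": "F", "grade_level": 11, "gpa": 3.6, "sat": 1250},
    {"gender": "M", "grade_level": 12, "gpa": 3.1, "sat": 1180},
    {"gender": "F", "grade_level": 10, "gpa": 3.9, "sat": 1320},
    {"gender": "M", "grade_level": 11, "gpa": 2.8, "sat": 1090},
]

# Question: "Do males tend to receive higher SAT scores than females?"
# Only gender (categorical) and SAT score (quantitative) are needed.
sat_by_gender = defaultdict(list)
for record in survey:
    sat_by_gender[record["gender"]].append(record["sat"])

# Viewed another way: one quantitative variable compared across two groups.
group_means = {gender: mean(scores) for gender, scores in sat_by_gender.items()}
print(group_means)
```

The same records support two descriptions of the question: two variables (gender and SAT score), or one quantitative variable split into two groups.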

Table 2: Determining the number of variables

Question: How is gender related to political views?
Number of variables: Two categorical variables (gender, political preference)

Question: Is there a correlation between GPA and SAT scores?
Number of variables: Two quantitative variables (GPA, SAT score)

Question: Do males tend to receive higher SAT scores than females?
Number of variables: Two variables, one categorical (gender) and one quantitative (SAT score); or one quantitative variable (SAT score) across two groups (male and female)

Number of Groups

Another important consideration when deciding which type of significance test is most appropriate is the number of groups involved. Is the analysis being done on a single group of subjects, or are two or more groups being compared? Comparative studies, those that show how one group differs from another, are often more relevant than single-sample studies. If you intend to compare multiple groups, you may choose to sample multiple sub-populations, or you may choose to sample from one large population and then break the sample into sub-groups. For example, if you want to conduct a study comparing teenage pregnancy rates among different socio-economic groups, you could sample several students from each socio-economic group separately, or you could take one large sample and then separate those subjects into different socio-economic groups.

Is the Test Valid?

Once you have determined which type of test is appropriate for your analysis, it is necessary to determine whether the test will yield valid results for your particular set of data. Each of the following significance tests involves calculations that are based on certain assumptions and conditions. If the necessary assumptions and conditions are not met, then the results of the test will not be valid, and the test should not be used. Conditions that depend on sample size (for example, a sample that is too small for its shape not to matter) can often be met by increasing your sample size.

1. The One-Proportion z Test

Note: The term "success" refers to the category of interest. For example, if you are conducting a one-proportion z test to determine if more than half of the population likes horror films, then liking horror films is considered a success and not liking horror films is considered a failure.
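The success/failure framing can be sketched in Python as follows. This sheet does not give the test statistic, so the formula used here is the standard textbook one, z = (p_hat - p0) / sqrt(p0 (1 - p0) / n). The function name and the counts (60 of 100 respondents liking horror films) are hypothetical.

```python
from math import erfc, sqrt


def one_proportion_z_test(successes, n, p0):
    """Return (z, upper-tail p-value) for H0: p = p0 versus Ha: p > p0.

    Assumes the standard textbook statistic; this sheet does not state it.
    """
    p_hat = successes / n                      # sample proportion of successes
    standard_error = sqrt(p0 * (1 - p0) / n)   # computed under the null value p0
    z = (p_hat - p0) / standard_error
    p_value = 0.5 * erfc(z / sqrt(2))          # P(Z >= z) for a standard normal Z
    return z, p_value


# Hypothetical data: 60 of 100 sampled people like horror films ("successes").
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(round(z, 3), round(p_value, 4))
```

Which outcome is labeled a success only determines which category is counted; swapping the labels tests the complementary proportion.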

2. The Two-Proportion z Test

Note: The term "success" refers to the category of interest. For example, if you are conducting a two-proportion z test to determine if a larger proportion of males than females likes horror films, then liking horror films is considered a success and not liking horror films is considered a failure.

3. The Chi-Square Goodness-of-Fit Test

Note: To calculate the expected counts, multiply the sample size by the expected percentage for each category. For example, if you are conducting a chi-square goodness-of-fit test to determine if the distribution of AP scores in your high school differs from the national distribution of AP scores, you would multiply the sample size by the percent of the nation scoring a 5, then by the percent of the nation scoring a 4, and so on. The results are the expected counts.

4. The Chi-Square Test for Homogeneity

Note: To calculate the expected counts, first organize the data in a two-way table. For each cell, multiply the row total by the column total, and then divide by the table total. This gives the expected count for each cell.

5. The Chi-Square Test for Independence

Note: To calculate the expected counts, first organize the data in a two-way table. For each cell, multiply the row total by the column total, and then divide by the table total. This gives the expected count for each cell.
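The two expected-count rules in the chi-square notes can be sketched in Python as shown below. The function names, the score distribution, and the two-way table are hypothetical and chosen only to make the arithmetic easy to follow; the proportions are not real national AP figures.

```python
def goodness_of_fit_expected(sample_size, expected_proportions):
    """Expected count per category: sample size times expected proportion."""
    return [sample_size * p for p in expected_proportions]


def two_way_expected(table):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in column_totals] for r in row_totals]


# Hypothetical proportions for scores 5, 4, 3, 2, 1 (not real national data).
assumed_distribution = [0.15, 0.20, 0.25, 0.22, 0.18]
gof_expected = goodness_of_fit_expected(200, assumed_distribution)

# Hypothetical two-way table of observed counts (rows: groups, columns: responses).
observed = [[30, 20],
            [10, 40]]
cell_expected = two_way_expected(observed)
print(gof_expected)
print(cell_expected)  # [[20.0, 30.0], [20.0, 30.0]]
```

The same `two_way_expected` calculation serves both the test for homogeneity and the test for independence, since the two notes give the same rule.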

6. The One-Sample t Test

Note: If the sample is sufficiently large (30 or more), the data need not be approximately normal. If you have a moderate sample size (at least 15 but fewer than 30), construct a modified box plot and verify that there are no strong outliers. If there are none, you may proceed with the test. If the sample size is small (less than 15), you must not only verify that there are no outliers, but also construct a histogram and verify that the shape of the distribution is approximately symmetric. Alternatively, you could construct a normal probability plot and verify that the plot is linear.

7. The Paired t Test

Note: The checks below apply to the paired differences. If the sample is sufficiently large (30 or more pairs), the differences need not be approximately normal. If you have a moderate sample size (at least 15 but fewer than 30), construct a modified box plot and verify that there are no strong outliers. If there are none, you may proceed with the test. If the sample size is small (less than 15), you must not only verify that there are no outliers, but also construct a histogram and verify that the shape of the distribution is approximately symmetric. Alternatively, you could construct a normal probability plot and verify that the plot is linear.

8. The Two-Sample t Test

Note: The checks below apply to each sample separately. If the sample is sufficiently large (30 or more), the data need not be approximately normal. If you have a moderate sample size (at least 15 but fewer than 30), construct a modified box plot and verify that there are no strong outliers. If there are none, you may proceed with the test. If the sample size is small (less than 15), you must not only verify that there are no outliers, but also construct a histogram and verify that the shape of the distribution is approximately symmetric. Alternatively, you could construct a normal probability plot and verify that the plot is linear.
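The sample-size guidance in the t test notes can be sketched as a small Python helper. The function names are hypothetical. The outlier rule used here (values more than 1.5 times the interquartile range beyond the quartiles) is the usual convention for a modified box plot and is an assumption, since this sheet does not define "outlier". The data are invented.

```python
from statistics import quantiles


def checks_needed(n):
    """List the checks the notes call for at a given sample size."""
    if n >= 30:
        return []  # large sample: data need not be approximately normal
    if n >= 15:
        return ["no strong outliers"]
    return ["no outliers", "approximately symmetric shape"]


def box_plot_outliers(data):
    """Values beyond 1.5 * IQR from the quartiles (assumed box plot rule)."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]


# Hypothetical small sample with one unusually large value.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]
required = checks_needed(len(sample))
outliers = box_plot_outliers(sample)
print(required)
print(outliers)
```

Quartile conventions differ between textbooks and software, so the exact fences may vary slightly from those a calculator reports.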

9. The One-Way ANOVA (Analysis of Variance) F Test

Note: To determine whether the groups have similar variances, construct a box plot for each group and compare the interquartile ranges. They need not be exactly the same, but should be similar. To determine whether the data for each group are approximately normal, examine the box plot for each group to verify there are no outliers, and examine the histogram for each group to verify the shape is approximately symmetric. Alternatively, for each group, subtract the group mean from each data value (producing residuals) and verify normality with a normal probability plot of the residuals.

10. The Linear Regression t Test

Note: To determine whether the true relationship between the variables is linear, it is sufficient to verify that the scatter plot of the data is roughly linear. To determine whether the variance is equal for all values of x, graph the least-squares regression line along with the scatter plot and verify that the vertical spread of the points about the line is roughly the same across all values of x. To determine if the data come from a normal population, it is sufficient to verify that the residuals are approximately normal (this can be accomplished by graphing a histogram or normal probability plot of the residuals).

11. Multiple Regression

Note: To determine whether the true relationship between the variables is linear, it is sufficient to verify that the scatter plots of the data (each x against y) are roughly linear. To determine whether the variance is equal for all values of x, graph the least-squares regression line along with each scatter plot and verify that the vertical spread of the points about the line is roughly the same across all values of x. To determine if the data come from a normal population, it is sufficient to verify that the residuals are approximately normal (this can be accomplished by graphing a histogram or normal probability plot of the residuals).
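The residuals referred to in the regression notes can be computed with a short Python sketch like the one below, which fits a least-squares line to one explanatory variable using the standard formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). The function name and the data are hypothetical; plotting the residuals as a histogram or normal probability plot is left to whatever graphing tool you use.

```python
from statistics import mean


def least_squares_residuals(x, y):
    """Fit y = intercept + slope * x by least squares and return the residuals."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Residual = observed y minus the value predicted by the line.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


# Hypothetical data, roughly linear.
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, residuals = least_squares_residuals(x_values, y_values)
print(round(slope, 2), round(intercept, 2))
print([round(r, 2) for r in residuals])
```

Residuals from a least-squares line always sum to zero (up to rounding), so what matters is their shape and spread: roughly normal, and about equally spread across the values of x.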