An Introduction to Research Statistics

Cris Burgess

"Statistics are like a lamppost to a drunken man - more for leaning on than illumination"
David Brent (alias Ricky Gervais, "The Office")

An Introduction to Statistics
9:30-9:50 Introduction
9:50-11:30 Conceptual issues/data entry
11:30-11:50 Coffee/Tea
11:50-1:00 SPSS demonstration/practical session 1
1:00-2:00 Lunch
2:00-3:30 Tests of frequency and difference
3:30-3:50 Coffee/Tea
3:50-5:00 SPSS demonstration/practical session 2

Course Philosophy
- Statistics don't have to be frightening
- Statistics can be simple and straightforward
- Use familiar concepts to explain unfamiliar ideas
- Statistics are a useful tool, used with discretion
- Basic concepts behind common statistical analyses
- No Greek symbols, minimal equations

Useful References
Statistics/SPSS
- Field, A. Discovering Statistics using SPSS for Windows. London: Sage.
- Greene, J. & D'Oliveira, M. Learning to use statistical tests in psychology. Buckingham: OUP.
- Pallant, J. SPSS Survival Manual. Buckingham: OUP.
Experimental considerations
- Harris, P. Designing and Reporting Experiments in Psychology. Buckingham: OUP.

Session One: Conceptual Issues
- Types of variable
- Function of statistics - what do they describe?
- Descriptive versus Inferential statistics
- Frequency distributions
- Statistical significance
- Power and effect size
- Statistical versus Real World significance
- Sampling issues
- Measurement issues
- Considerations for SPSS entry

Types of variable
- SPSS can calculate all these descriptives for you
- Before we can analyse any data, we need to code it for the computer
- SPSS can only understand numbers!
- Four different kinds of variable:
  - Nominal (unordered category)
  - Ordinal (ordered category)
  - Interval (scale)
  - Ratio (scale)

Nominal variable
- Unordered category variable
- Uses numbers as names for categories
- Size of number tells us nothing about the differences between categories
- e.g. 1 = male, 2 = female, or numbers on buses

Ordinal variable
- Ordered category variable
- Order of categories in data
- Number tells us that 1 is more than 2
- e.g. 1 = Gold, 2 = Silver, 3 = Bronze, or the order buses are listed on the timetable

Interval (scale) variable
- Continuous, scale variable
- Equal differences in the numbers indicate equal differences in the size of the attribute being measured
- Measure has no true zero
- e.g. temperature (centigrade scale), or time of day for buses' arrival

Ratio (scale) variable
- Continuous, scale variable
- Equal differences in the numbers indicate equal differences in the size of the attribute being measured
- True zero exists
- e.g. height (centimetres), weight (kilograms), or time between each bus

Summary
- Nominal: unordered category, name
- Ordinal: ordered category, order
- Interval: scale, no true zero
- Ratio: scale, true zero

So, types of variable

Variable          Type       Example
Home town         Nominal    1=Exeter, 2=Taunton, 3=Truro
Nearest town      Ordinal    1=Exeter, 2=Taunton, 3=Truro
Time left home    Interval   11.75=11:45am, 15.5=3:30pm
Journey time      Ratio      5.5 hours

Types of statistics
- Descriptive statistics
  - Summarise the data ("shorthand")
  - Describe the symptoms
- Inferential statistics
  - Look at relationships within the data
  - Draw inferences
  - Establish the possible causes

Types of descriptive statistics
- Average scores
  - Mode (most frequent category)
  - Median (score which splits sample 50:50)
  - Mean ("average")
- Measures of dispersion or spread
  - Range (minimum to maximum)
  - Inter-quartile range (central 50% of sample)
  - Standard deviation

Descriptive statistics
- The central tendency measure is clearly useful, but why is dispersion important?
- Two distributions, same mean, different standard deviation (or range)
[Figure: two frequency distributions of reaction time (ms) with the same mean R.T.: large s.d. (platykurtic) and small s.d. (leptokurtic)]

Descriptive statistics
- Which descriptive statistics make sense?
  - Make of car?
  - Degree classification?
  - Temperature?
  - Reaction time?
- Which don't make any sense?
- Remember, the numbers in the spreadsheet are real data from real people answering real questions
- Your statistics should make sense in the relevant experimental context

Frequency distributions
- How many scores in each category
- Represents data in visual form
- Normal distribution (parametric)
[Figure: "The Sun" readership (mental age distribution); frequency by age category]
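The descriptive statistics listed above can be sketched in a few lines of Python using only the standard library. This is an illustration rather than anything from the course materials: the function name `describe` and both sets of reaction times are made up for the example. The two lists have the same mean but very different spread, which is the point the dispersion slide makes.

```python
import statistics

def describe(scores):
    """Return the descriptive statistics named on the slides for a list of numbers."""
    q1, _q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    return {
        "mode": statistics.mode(scores),
        "median": statistics.median(scores),
        "mean": statistics.mean(scores),
        "range": (min(scores), max(scores)),
        "iqr": q3 - q1,                   # spread of the central 50%
        "sd": statistics.stdev(scores),   # sample standard deviation (n - 1)
    }

# Hypothetical reaction times (ms): same mean, different spread.
narrow = [480, 490, 500, 500, 510, 520]
wide = [300, 400, 500, 500, 600, 700]

for label, data in (("narrow", narrow), ("wide", wide)):
    summary = describe(data)
    print(label, summary["mean"], round(summary["sd"], 1), summary["range"])
```

Only the mean and standard deviation make sense for scale data like these; for a nominal variable such as make of car, the mode is the only statistic in the list worth reporting.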

Frequency distributions
- Normally distributed? Parametric test
  - Normally-distributed data only
  - e.g. reaction time (in ms)
- Not normally distributed? Non-parametric test
  - Data that is not normally distributed
  - e.g. number of car crashes in driving career
[Figure: two example frequency histograms, one normally distributed and one not]

The Normal distribution
- Can increase power of experiment to detect effect
- Allows us to make assumptions about the data
  - Average values are common
  - Extreme values are rare
- Must be:
  - Interval or ratio data
  - Able to take a wide range of values
  - Accurately measured

Sampling Issues
- Not necessary for sample to be normally distributed in order to justify using a parametric test
- Variable must be normally distributed in general population
- In other words, sample must be taken from a population that reflects a parametric (normal) distribution for that particular variable
[Figure: reaction time distribution; frequency against reaction time (ms)]

Types of statistics
- Descriptive statistics
  - Summarise the data ("shorthand")
  - Describe the symptoms
- Inferential statistics
  - Look at relationships within the data
  - Draw inferences
  - Establish the possible causes
  - Indicate how significant the results are

Inferential statistics: What's the answer? What's the question?
- May sound like stating the bleeding obvious, but...
- Establishing the specific research question to test is the hardest thing in research
  - What is the relationship between A and B?
  - How are A and B measured?
  - How can differences/similarities be tested?

Inferential statistics
- How can we compare variable scores against one another?
  - Frequency - compares membership of categories (nominal or ordinal), e.g. Chi-square test
  - Differences - compares groups in terms of a score (interval or ratio), e.g. t-test, Analysis of Variance (ANOVA)
  - Association - compares two (or more) variables (ordinal, interval or ratio), e.g. Correlation, Regression, Factor analysis

Statistical significance
- Is the observed effect likely to happen by chance, or is something else responsible?
- Coin tossing: how many heads in a run of tosses?
  - Roughly half heads: even chance
  - Nearly all heads: trick coin?
- We choose an arbitrary point (5%), beyond which the observed effect is not considered to be due to chance
- Hence, we attribute it to some other factor (i.e. our manipulation)
- A result with less than 5% probability of arising by chance: p < 5% or p < .05

Effect Size and Power
- Effect size
  - Magnitude of difference between conditions
  - Difference IV makes to DV scores
  - Estimate effect size from previous research
  - How bright are the stars?
- Power
  - Capacity to correctly reject null hypothesis
  - Sensitivity to relevant effect sizes
  - Ability to detect effect
  - How powerful is your telescope?
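The coin-tossing logic above can be made concrete with a short calculation. The sketch below assumes a run of 10 tosses of a fair coin, which is an assumption made for illustration, and `prob_at_least` is a hypothetical helper name. It works out how likely a given number of heads (or more) is by chance alone and compares that with the 5% point.

```python
from math import comb

def prob_at_least(heads, tosses, p=0.5):
    """Probability of `heads` or more heads in `tosses` tosses when P(head) = p."""
    return sum(
        comb(tosses, k) * p**k * (1 - p) ** (tosses - k)
        for k in range(heads, tosses + 1)
    )

ALPHA = 0.05  # the arbitrary 5% point

# Assumed run length of 10 tosses, for illustration only.
for heads in (5, 9):
    p_value = prob_at_least(heads, 10)
    verdict = "beyond the 5% point" if p_value < ALPHA else "consistent with chance"
    print(f"{heads} or more heads out of 10: p = {p_value:.3f} ({verdict})")
```

This is a one-tailed calculation (only "too many heads" counts as extreme). A two-tailed version would also count runs with too few heads, which doubles the probability for a fair coin.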

Effect Size and Power: How bright is bright enough?
- Task 1: to map the stars
  - Purpose: to draw constellations
  - Effect size: large (major stars only required)
  - Power: can be relatively low-powered
- Task 2: to map the stars
  - Purpose: to guide space probe to Alpha Centauri
  - Effect size: small (major and minor stars required)
  - Power: must be relatively high-powered
- Effectively the same as deciding the level of statistical significance

Real World significance
- Increasing sample size increases statistical significance
- Changing the test type can increase statistical significance
- Is the difference really significant?
- Does it make a rational difference in the context of the real world setting?
- Remember, the statistics reflect the effect of real variables on real people

Power
- Power of an experiment to detect relevant effects depends on five main features:
  - Size of sample
  - Significance level
  - One or two-tailed test
  - Between or within subjects measure
  - Parametric or non-parametric test

Parametric vs Non-parametric
- Parametric tests calculate exact numerical differences between scores (and assume normal distribution)
- Non-parametric tests only take into account whether certain scores are higher or lower than other scores ("ranking" of data)
- Parametric tests are much more sensitive than non-parametric tests
- Using parametric tests increases the power of your study to find significant effects

Experimental Design: Repeated vs Non-repeated measures
- Repeated
  - Same participants in all conditions
  - "Within subjects" or "Related samples"
- Non-repeated
  - Different participants in each condition
  - "Between subjects" or "Independent samples"
- Whether measures are repeated or non-repeated has important implications for comparison of groups

Experimental Design
- Repeated
  - Can compare Joe Bloggs's score in condition 1 with his score in condition 2
- Non-repeated
  - No basis for comparing one particular score in condition 1 with a particular score in condition 2
- Requires very different sets of calculations
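To show why repeated and non-repeated designs need "very different sets of calculations", here is a minimal sketch of the two t statistics written out by hand with the standard library. The scores are invented, and the function names are chosen for this example only. The same numbers are analysed twice: once as paired scores from the same seven people, once as if they came from two separate groups.

```python
import math
import statistics

def independent_t(a, b):
    """Student's t for two independent groups, using the pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se

def paired_t(a, b):
    """t for repeated measures: a one-sample t on each participant's difference."""
    diffs = [x - y for x, y in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

# Hypothetical scores for seven participants; everyone declines slightly.
time1 = [52, 61, 47, 70, 58, 65, 49]
time2 = [50, 58, 46, 66, 55, 63, 46]

print(f"treated as repeated measures:     t = {paired_t(time1, time2):.2f}")
print(f"treated as non-repeated measures: t = {independent_t(time1, time2):.2f}")
```

The paired version removes the large differences between people and looks only at each person's change, which is why the repeated measures design is the more powerful of the two for data like these.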

Repeated Measures
[Figure: scores for Subj 1 to Subj 7 at Time 1 and Time 2, plotted as a repeated measures design and as a non-repeated measures design]
- Correct data entry indicates all subjects show a systematic decline
- Unmatched data show no systematic effects (some decrements, but also some improvements!)

Degrees of freedom (df)
- Degrees of freedom (df) is the term used to describe a fundamental (but quite abstract) concept in statistics
- You have three numbers, and they have a given mean value
- You can change any of these numbers, so long as the mean remains the same
- How many numbers can you change? How many numbers are free to vary?
- E.g. change one number to 7 and another to 13: the third number is then fixed if the mean is to remain unchanged
- For a single sample mean, degrees of freedom equal the sample size (n) minus one

Summary
- Must describe data before looking for relationships
- How data is described depends on the type of variable
- Parametric tests are more powerful than non-parametric tests
- Repeated measures are more powerful than non-repeated measures
- Large samples are more powerful than small samples
- More powerful experiments are more likely to produce significant results, but are they really significant in the "Real World" sense?
- How powerful we need the inferential tests to be depends on the purposes of the study
- Never lose sight of the fact that the data come from real people, engaged in real behaviour, in the real world!

Considerations for SPSS input
- Rules of thumb
- Data coding
- Defining variables
- Data cleaning
- Handout and exercises

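The degrees-of-freedom example can be checked with a few lines of arithmetic. In this sketch the target mean of 10 is an assumed value chosen for illustration, and `forced_last_value` is a hypothetical helper name. Once two of the three numbers have been chosen freely (7 and 13), the third is no longer free.

```python
def forced_last_value(free_values, n, mean):
    """Given n - 1 freely chosen values, return the one value that keeps the mean fixed."""
    if len(free_values) != n - 1:
        raise ValueError("exactly n - 1 values are free to vary")
    return n * mean - sum(free_values)

# Assumed target mean of 10 for three numbers; 7 and 13 are chosen freely.
third = forced_last_value([7, 13], n=3, mean=10)
print(third)                  # the only value that keeps the mean at 10
print((7 + 13 + third) / 3)   # confirms the mean is unchanged
```

With three numbers and a fixed mean, two values are free to vary and one is forced, so df = n - 1 = 2.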
Rules of Thumb for data entry
- One row of data for each respondent/participant
- One column for each variable
- Enter all the information in raw form
- Save a copy of the raw data file in a safe place
- Transformations can be numerous and fundamentally change the nature of the data - it is worthwhile keeping an untransformed copy for later reference
- Once entered, confirm that all the available information is in the spreadsheet

Considerations for entering data
- Type of variable - measurement type: scale, ordinal, nominal
  - e.g. height, Likert-type scale response, gender
- Coding of non-numeric variables
  - e.g. male = 1, female = 2, missing = 99
- Variable value labels
  - Especially useful for nominal variables

Defining your variables
- Before we can analyse our data, we must code it for SPSS
- SPSS can only understand numbers, not words
- Four types of variable, but SPSS only recognises three:
  - Nominal
  - Ordinal
  - Interval (SPSS "Scale")
  - Ratio (SPSS "Scale")

Defining your variables
- Variable names are limited to 8 characters
- Use the Variable Label option to provide a more complete description
- Use the Value Label option to provide details of your coding - the category names will appear in your output (and make the task of interpreting the output a great deal easier)
- Coding of non-numeric (nominal) variables
  - e.g. male = 1, female = 2, missing = 99

Missing values
- Respondent fails to answer question
  - Didn't understand question
  - Recorded more than one response
  - Couldn't understand question
  - Response doesn't make sense
- Must define missing values
  - Can use 99, -1, 9999 etc.
  - SPSS must ignore these entries
  - Must define the missing values for each variable

Unexpected values
- Check data using Frequencies analysis
- Can spot any unexpected values for a particular variable
  - Mistake during data entry
  - E.g. Gender returns values of 1, 2, 3, and 99
- Barchart or Histogram plot will make this task easier
- May have to recheck data entry from questionnaire

Histogram
- Could a chart help to describe your variables?
- Histogram
  - Plots frequency distribution for single variable
  - Can spot unexpected values
  - Can see how closely variable resembles normal distribution
  - Can decide whether to use parametric or non-parametric test

Measurement & coding of data
- Statements:
  - I really enjoy the feeling of accelerating hard
  - I enjoy showing the "Sunday drivers" how to really drive
  - I get no thrill from travelling at speed
  - It is completely unimportant who is first away from the lights
- Likert-type scale response:
  - 1 = strongly disagree
  - 2 = disagree
  - 3 = neither agree nor disagree
  - 4 = agree
  - 5 = strongly agree
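The frequency check for unexpected values and the handling of missing-value codes described above can be illustrated outside SPSS. The following Python sketch uses an invented gender column and hypothetical helper names; it is not how SPSS itself works, only the same idea in a few lines.

```python
from collections import Counter

MISSING = 99                      # code chosen to mean "no valid answer"
VALID = {1: "male", 2: "female"}  # value labels for the nominal variable

def frequencies(values):
    """Count how often each code occurs, like a Frequencies table."""
    return dict(sorted(Counter(values).items()))

def unexpected(values):
    """Codes that are neither a defined category nor the missing-value code."""
    return sorted(set(values) - set(VALID) - {MISSING})

# Hypothetical data column; the 3 is a typing slip made during data entry.
gender = [1, 2, 2, 1, 3, 99, 2, 1]

print(frequencies(gender))   # every code and its count
print(unexpected(gender))    # codes to recheck against the questionnaire

# Analyses should ignore missing and unexpected codes.
valid_only = [v for v in gender if v in VALID]
print(len(valid_only), "usable responses")
```

Any code returned by `unexpected` points to a row worth rechecking against the original questionnaire before analysis.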

Scale construction
- Confirm all responses are within the expected range (can use Frequencies analysis)
- How good is each statement at discriminating between positive and negative attitudes towards the issue/object?
- If everyone records the same score for a particular item, is it useful?
- How are scores distributed for each item?
- Plotting a histogram or bar-chart will help

Scale construction
- Add scores for groups of items together to create a scale
  - e.g. overall attitude towards smoking, drinking etc.
- Number of individual statements, to which respondents have reported their degree of agreement
- How to combine these items into a single score for each respondent?
- Need to create a scale

Scale construction
- Some items will have been counterbalanced
- Must recode them, so that they all point in the same direction
  - Transform > Recode
  - Into Different Variables vs Into Same Variables
- Once recoding has been done, simply add scores together to give a total
  - Use the Transform > Compute command from the main menu
  - See the Handout
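The recode-then-add procedure can be sketched as follows. This is a Python illustration of the arithmetic, not the SPSS Recode or Compute commands themselves. The item names are made up to stand for the four driving statements, and the choice of which items count as counterbalanced is an assumption made for the example.

```python
SCALE_MAX = 5  # 1 = strongly disagree ... 5 = strongly agree

def reverse(score, scale_max=SCALE_MAX):
    """Recode a counterbalanced item so a high score points the same way as the rest."""
    return scale_max + 1 - score

def scale_total(responses, reversed_items):
    """Sum one respondent's item scores after recoding the reversed items."""
    return sum(
        reverse(score) if item in reversed_items else score
        for item, score in responses.items()
    )

# Hypothetical item names; the last two are worded in the opposite direction.
REVERSED = {"no_thrill", "lights_unimportant"}
respondent = {
    "accelerating": 5,
    "sunday_drivers": 4,
    "no_thrill": 1,
    "lights_unimportant": 2,
}

print(scale_total(respondent, REVERSED))
```

On a 1 to 5 scale, reversing turns 1 into 5, 2 into 4 and leaves 3 unchanged, so a respondent who strongly disagrees with "I get no thrill from travelling at speed" scores the same as one who strongly agrees with a positively worded item.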