An Introduction to Research Statistics

Cris Burgess

"Statistics are like a lamppost to a drunken man - more for leaning on than illumination"
  David Brent (alias Ricky Gervais, The Office)

An Introduction to Statistics
  9:3-9:5 Introduction
  9:5-11:3 Conceptual issues/data entry
  11:3-11:5 Coffee/Tea
  11:5-1: SPSS demonstration/practical session 1
  1: - : Lunch
  : - 3:3 Tests of frequency and difference
  3:3-3:5 Coffee/Tea
  3:5-5: SPSS demonstration/practical session

Course Philosophy
  Statistics don't have to be frightening
  Statistics can be simple and straightforward
  Use familiar concepts to explain unfamiliar ideas
  Statistics are a useful tool, used with discretion
  Basic concepts behind common statistical analyses
  No Greek symbols, minimal equations

Useful References
  Statistics/SPSS
    Field, A. () Discovering Statistics using SPSS for Windows. London: Sage.
    Greene, J. & D'Oliveira, M. (19, 199) Learning to use statistical tests in psychology. Buckingham: OUP.
    Pallant, J. (1) SPSS Survival Manual. Buckingham: OUP.
  Experimental considerations
    Harris, P. () Designing and Reporting Experiments in Psychology. Buckingham: OUP.

Session One: Conceptual Issues
  Types of variable
  Function of statistics - what do they describe?
  Descriptive versus inferential statistics
  Frequency distributions
  Statistical significance
  Power and effect size
  Statistical versus "Real World" significance
  Sampling issues
  Measurement issues
  Considerations for SPSS entry

Types of variable
  SPSS can calculate all these descriptives for you
  Before we can analyse any data, we need to code it for the computer
  SPSS can only understand numbers!
  Four different kinds of variable:
    Nominal (unordered category)
    Ordinal (ordered category)
    Interval (scale)
    Ratio (scale)

Nominal variable
  Unordered category variable
  Uses numbers as names for categories
  Size of number tells us nothing about the differences between categories
  e.g. 1 = male, 2 = female, or the numbers on buses

Ordinal variable
  Ordered category variable
  Order of categories in data
  Number tells us that 1 is more than 2
  e.g. 1 = Gold, 2 = Silver, 3 = Bronze, or the order in which buses are listed on the timetable

Interval (scale) variable
  Continuous, scale variable
  Equal differences in the numbers indicate equal differences in the size of the attribute being measured
  Measure has no true zero
  e.g. temperature (centigrade scale), or the time of day at which buses arrive

Ratio (scale) variable
  Continuous, scale variable
  Equal differences in the numbers indicate equal differences in the size of the attribute being measured
  True zero exists
  e.g. height (centimetres), weight (kilograms), or the time between each bus

Summary
  Nominal: unordered category, name
  Ordinal: ordered category, order
  Interval: scale, no true zero
  Ratio: scale, true zero

So, types of variable
  Variable         Type      Example
  Home town        Nominal   1=Exeter, 2=Taunton, 3=Truro
  Nearest town     Ordinal   1=Exeter, 2=Taunton, 3=Truro
  Time left home   Interval  11.75=11:45am, 15.5=3:30pm
  Journey time     Ratio     e.g. 5.5 hours

Types of statistics
  Descriptive statistics
    Summarise the data ("shorthand")
    Describe the symptoms
  Inferential statistics
    Look at relationships within the data
    Draw inferences
    Establish the possible causes

Types of descriptive statistics
  Average scores
    Mode (most frequent category)
    Median (score which splits sample 50:50)
    Mean ("average")
  Measures of dispersion or spread
    Range (minimum to maximum)
    Inter-quartile range (central 50% of sample)
    Standard deviation

Descriptive statistics
  The central tendency measure is clearly useful, but why is dispersion important?
  Two distributions, same mean, different standard deviation (or range)
  [Figure: two frequency distributions of reaction time (ms) with the same mean R.T.; large s.d. (platykurtic) and small s.d. (leptokurtic)]

Descriptive statistics
  Which descriptive statistics make sense?
    Make of car?
    Degree classification?
    Temperature?
    Reaction time?
  Which don't make any sense?
  Remember, the numbers in the spreadsheet are real data from real people answering real questions
  Your statistics should make sense in the relevant experimental context

Frequency distributions
  How many scores in each category
  Represents data in visual form
  Normal distribution (parametric)
  [Figure: "The Sun" readership (mental age distribution); frequency by age category]
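The descriptive statistics listed above can be tried outside SPSS. The following is a minimal sketch in Python, assuming two made-up sets of reaction times (the values and the car makes are illustrative, not course data). It shows two samples with the same mean but very different spread, and a nominal variable for which only the mode makes sense.

```python
import statistics

# Hypothetical reaction times in ms (illustrative values, not course data).
wide = [300, 400, 500, 600, 700]
narrow = [480, 490, 500, 510, 520]

def describe(scores):
    """Return central tendency and dispersion measures for scale data."""
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "range": max(scores) - min(scores),
        "iqr": q3 - q1,                  # spread of the central 50%
        "sd": statistics.stdev(scores),  # sample standard deviation
    }

# For a nominal variable (e.g. make of car) only the mode makes sense.
cars = ["Ford", "Fiat", "Ford", "Volvo"]
most_common_make = statistics.mode(cars)

print(describe(wide))
print(describe(narrow))
print(most_common_make)
```

Both samples have the same mean and median, so only the range, inter-quartile range and standard deviation reveal how differently the scores are spread.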

Frequency distributions
  Parametric test
    Normally-distributed data only
    e.g. reaction time (in ms)
    [Figure: frequency histogram - normally distributed?]
  Non-parametric test
    Data that are not normally distributed
    e.g. number of car crashes in driving career
    [Figure: frequency histogram - normally distributed?]

The Normal distribution
  Can increase power of experiment to detect effect
  Allows us to make assumptions about the data
    Average values are common
    Extreme values are rare
  Must be:
    interval or ratio data
    able to take wide range of values
    accurately measured

Sampling Issues
  Not necessary for sample to be normally distributed in order to justify using a parametric test
  Variable must be normally distributed in general population
  In other words, sample must be taken from a population that reflects a parametric (normal) distribution for that particular variable
  [Figure: reaction time distribution; frequency against reaction time (ms)]

Types of statistics
  Descriptive statistics
    Summarise the data ("shorthand")
    Describe the symptoms
  Inferential statistics
    Look at relationships within the data
    Draw inferences
    Establish the possible causes
    Indicate how significant the results are

Inferential statistics
  What's the answer? What's the question?
  May sound like stating the bleeding obvious, but establishing the specific research question to test is the hardest thing in research
  What is the relationship between A and B?
  How are A and B measured?
  How can differences/similarities be tested?

Inferential statistics
  How can we compare variable scores against one another?
  Frequency - compares membership of categories (nominal or ordinal), e.g. Chi-square test
  Differences - compares groups in terms of a score (interval or ratio), e.g. t-test, Analysis of Variance (ANOVA)
  Association - compares two (or more) variables (ordinal, interval or ratio), e.g. Correlation, Regression, Factor analysis

Statistical significance
  Is the observed effect likely to happen by chance, or is something else responsible?
  Coin tossing: some counts of heads are what an even chance predicts; a very lopsided count suggests a trick coin
  We choose an arbitrary point (5%), beyond which the observed effect is not considered to be due to chance
  Hence, we attribute it to some other factor (i.e. our manipulation)
  A result with less than 5% probability of occurring by chance is written p < 5% or p < .05

Effect Size and Power
  Effect size
    magnitude of difference between conditions
    difference IV makes to DV scores
    estimate effect size from previous research
    How bright are the stars?
  Power
    capacity to correctly reject null hypothesis
    sensitivity to relevant effect sizes
    ability to detect effect
    How powerful is your telescope?
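The coin-tossing argument for statistical significance can be checked with a short calculation. This is a minimal sketch in Python, assuming a run of 10 tosses of a fair coin; the number of tosses and the function name prob_at_least are illustrative choices, not taken from the course materials.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more heads in n tosses of a coin with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Assumed example: 10 tosses of a fair coin.
n = 10
for heads in (5, 9):
    chance = prob_at_least(heads, n)
    verdict = "p < .05" if chance < 0.05 else "not significant"
    print(f"{heads} or more heads out of {n}: {chance:.4f} ({verdict})")
```

With these assumed numbers, five or more heads is unremarkable, while nine or more heads has a probability of 11/1024 (about 0.011), which falls beyond the conventional 5% point.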

Effect Size and Power
  How bright is "bright enough"?
  Task 1:
    Purpose: to map the stars, to draw constellations
    Effect size: large (major stars only required)
    Power: can be relatively low-powered
  Task 2:
    Purpose: to map the stars, to guide space probe to Alpha Centauri
    Effect size: small (major and minor stars required)
    Power: must be relatively high-powered
  Effectively the same as deciding the level of statistical significance

"Real World" significance
  Increasing sample size increases statistical significance
  Changing the test type can increase statistical significance
  Is the difference really significant?
  Does it make a rational difference in the context of the real world setting?
  Remember, the statistics reflect the effect of real variables on real people

Power
  Power of an experiment to detect relevant effects depends on five main features:
    size of sample
    significance level
    one- or two-tailed test
    between- or within-subjects measure
    parametric or non-parametric test

Parametric vs Non-parametric
  Parametric tests calculate exact numerical differences between scores (and assume normal distribution)
  Non-parametric tests only take into account whether certain scores are higher or lower than other scores ("ranking" of data)
  Parametric tests are generally more sensitive than non-parametric tests when their assumptions hold
  Using parametric tests increases the power of your study to find significant effects

Experimental Design: Repeated vs Non-repeated measures
  Repeated
    Same participants in all conditions
    "Within subjects" or "Related samples"
  Non-repeated
    Different participants in each condition
    "Between subjects" or "Independent samples"
  Whether measures are repeated or non-repeated has important implications for comparison of groups

Experimental Design
  Repeated
    Can compare Joe Bloggs's score in condition 1 with his score in condition 2
  Non-repeated
    No basis for comparing one particular score in condition 1 with a particular score in condition 2
  Requires very different sets of calculations
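To illustrate why repeated and non-repeated designs need different calculations, here is a minimal sketch in Python. The scores for seven participants are made up for illustration, and the two functions are standard textbook forms of the related-samples and independent-samples (equal variance) t statistics, not the course's own worked example.

```python
import math
import statistics

# Hypothetical scores for 7 participants in two conditions (illustrative only).
condition1 = [52, 61, 45, 70, 58, 66, 49]
condition2 = [49, 59, 41, 67, 56, 62, 46]

def paired_t(a, b):
    """Related-samples t: uses each participant's own difference score."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

def independent_t(a, b):
    """Independent-samples t (equal variances): uses only the group means."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

print(round(paired_t(condition1, condition2), 2))
print(round(independent_t(condition1, condition2), 2))
```

Every participant in this made-up data set scores slightly lower in the second condition. The related-samples calculation picks up that consistent drop, whereas treating the same numbers as two unrelated groups buries it under the large differences between people.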

Repeated Measures
  Repeated measures design: correct data entry indicates all subjects show a systematic decline
  Non-repeated measures design: unmatched data show no systematic effects (some decrements, but also some improvements!)
  [Figure: Time 1 and Time 2 scores for Subj 1 to Subj 7, shown matched and unmatched]

Degrees of freedom (df)
  Degrees of freedom (df) is the term used to describe a fundamental (but quite abstract) concept in statistics
  You have three numbers, and they have a given mean value
  You can change any of these numbers, so long as the mean remains the same
  How many numbers can you change? How many numbers are free to vary?
  e.g. once two of the numbers have been changed (say, to 7 and 13), the third number is fixed if the mean is to remain unchanged
  For a single sample, degrees of freedom equals the sample size (n) minus one

Summary
  Must describe data before looking for relationships
  How data are described depends on the type of variable
  Parametric tests are more powerful than non-parametric tests
  Repeated measures more powerful than non-repeated measures
  Large samples more powerful than small samples
  More powerful experiments will lead to more significant results, but are they really significant in the "Real World" sense?
  How powerful we need the inferential tests to be depends on the purposes of the study
  Never lose sight of the fact that the data come from real people, engaged in real behaviour, in the real world!

Considerations for SPSS input
  Rules of thumb
  Data coding
  Defining variables
  Data cleaning
  Handout and exercises
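The degrees-of-freedom idea can be made concrete with a few lines of Python. This is a minimal sketch, assuming three numbers that must keep a mean of 10; the mean and the helper name forced_value are illustrative assumptions.

```python
def forced_value(free_values, mean, n):
    """Value the last number must take so that n numbers keep the given mean."""
    return mean * n - sum(free_values)

# Assumed example: three numbers whose mean must stay at 10.
n = 3
mean = 10
chosen = [7, 13]                        # two numbers can be chosen freely
third = forced_value(chosen, mean, n)   # the third is then determined
degrees_of_freedom = n - 1              # only n - 1 values are free to vary

print(third, degrees_of_freedom)
```

Once two of the three values are chosen, the total needed to keep the mean leaves no choice for the last one, which is why only n - 1 of the values count as free.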
Rules of Thumb for data entry
  One row of data for each respondent/participant
  One column for each variable
  Enter all the information in raw form
  Save a copy of the raw data file in a safe place
    Transformations can be numerous and can fundamentally change the nature of the data - it is worthwhile keeping an untransformed copy for later reference
  Once entered, confirm that all the available information is in the spreadsheet

Considerations for entering data
  Type of variable - measurement type: scale, ordinal, nominal (e.g. height, Likert-type scale response, gender)
  Coding of non-numeric variables, e.g. male = 1, female = 2, missing = 99
  Variable value labels - especially useful for nominal variables

Defining your variables
  Before we can analyse our data, we must code it for SPSS
  SPSS can only understand numbers, not words
  Four types of variable, but SPSS only recognises three:
    Nominal
    Ordinal
    Interval (SPSS "Scale")
    Ratio (SPSS "Scale")

Defining your variables
  Variable names are limited in length
  Use the Variable Label option to provide a more complete description
  Use the Value Label option to provide details of your coding - the category names will appear in your output (and make the task of interpreting the output a great deal easier)
  Coding of non-numeric (nominal) variables, e.g. male = 1, female = 2, missing = 99

Missing values
  Respondent fails to answer question
    didn't or couldn't understand question
    recorded more than one response
    response doesn't make sense
  Must define missing values
    can use 99, -1, 9999 etc.
    SPSS must ignore these entries
    must define the missing values for each variable

Unexpected values
  Check data using Frequencies analysis
  Can spot any unexpected values for a particular variable, e.g. a mistake during data entry
  e.g. Gender returns values of 1, 2, 3 and 99
  Bar chart or Histogram plot will make this task easier
  May have to recheck data entry from questionnaire

Histogram
  Could a chart help to describe your variables?
  Plots frequency distribution for single variable
  Can spot unexpected values
  Can see how closely variable resembles normal distribution
  Can decide whether to use parametric or non-parametric test

Measurement & coding of data
  I really enjoy the feeling of accelerating hard
  I enjoy showing the "Sunday drivers" how to really drive
  I get no thrill from travelling at speed
  It is completely unimportant who is first away from the lights
  Likert-type scale response:
    1 = strongly disagree
    2 = disagree
    3 = neither agree nor disagree
    4 = agree
    5 = strongly agree
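The frequency check for unexpected values described above can be sketched outside SPSS as well. The following Python example assumes the gender coding used in the examples (1 = male, 2 = female, 99 = missing) and a made-up column of entries; the function name frequency_check is hypothetical.

```python
from collections import Counter

# Coding scheme assumed from the examples above.
VALID_CODES = {1: "male", 2: "female"}
MISSING_CODE = 99

# Hypothetical data column; the 3 stands in for a data-entry mistake.
gender = [1, 2, 2, 1, 3, 99, 2, 1]

def frequency_check(values, valid, missing):
    """Count each code and list any that are neither valid nor the missing code."""
    counts = Counter(values)
    unexpected = sorted(v for v in counts if v not in valid and v != missing)
    return counts, unexpected

counts, unexpected = frequency_check(gender, VALID_CODES, MISSING_CODE)
for code in sorted(counts):
    label = VALID_CODES.get(code, "missing" if code == MISSING_CODE else "UNEXPECTED")
    print(code, label, counts[code])
print("Recheck against questionnaire:", unexpected)
```

Any code listed as unexpected should be traced back to the original questionnaire rather than silently deleted.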

Scale construction
  Confirm all responses within expected range (can use Frequencies analysis)
  How good is each statement at discriminating between positive and negative attitudes towards the issue/object?
  If everyone records the same score for a particular item, is it useful?
  How are scores distributed for each item?
  Plotting a histogram or bar chart will help

Scale construction
  Number of individual statements, to which respondents have reported their degree of agreement
  How to combine these items into a single score for each respondent?
  Need to create a scale
  Add scores for groups of items together to create a scale, e.g. overall attitude towards smoking, drinking etc.

Scale construction
  Some items will have been counterbalanced
  Must recode them, so that they all point in the same direction
    Transform...Recode: Into Different Variables vs Into Same Variables
  Once recoding has been done, simply add scores together to give a total
    Use Transform...Compute command from main menu
  See the Handout
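The recode-then-add procedure can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the SPSS procedure itself. It assumes a 1 to 5 Likert coding; the item names and the responses are hypothetical, and which items count as counterbalanced is an assumption based on their wording.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # 1 = strongly disagree ... 5 = strongly agree

# Hypothetical names for counterbalanced (negatively worded) items.
REVERSED_ITEMS = {"no_thrill_from_speed", "unimportant_first_from_lights"}

def reverse_code(score):
    """Flip a score so that 1 becomes 5, 2 becomes 4, and 3 stays 3."""
    return SCALE_MIN + SCALE_MAX - score

def scale_total(responses):
    """Recode the counterbalanced items, then add all items into one score."""
    return sum(
        reverse_code(score) if item in REVERSED_ITEMS else score
        for item, score in responses.items()
    )

# One hypothetical respondent.
respondent = {
    "enjoy_accelerating": 5,
    "show_sunday_drivers": 4,
    "no_thrill_from_speed": 1,
    "unimportant_first_from_lights": 2,
}
total = scale_total(respondent)
print(total)
```

Keeping the raw responses in their own dictionary and computing the recoded total separately mirrors the advice to recode into different variables and keep an untransformed copy of the data.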