Announcement: Homework #2 is due next Friday at 5pm. The midterm is in 2 weeks. It will cover everything through the end of next week (week 5).

Political Science 15 Lecture 8: Descriptive Statistics (Part 1)

Data Coding
Coding is the process of assigning numerical values to the values of your variable. The meaning of these codes depends on the level of measurement of the variable:
Nominal: codes are just indications of the category
Ordinal: codes are indications of ordering
Interval/Ratio: codes are the actual numerical value

Preparing Data for Hypothesis Testing
Gather measurements on all of the concepts important for your hypothesis (dependent, independent, and control variables).
Enter them into a spreadsheet. We will use SPSS in this class.
Each row is an observation (unit), and each column is a variable.

Example of Data Ready for Hypothesis Testing

Interview #   Religion   Income   Ideology
1             1          35000    4
2             1          46000    3
3             3          82000    5
4             2          19000    2
5             1          67000    6

We use a codebook to find out what these numbers mean.

Descriptive Statistics
Descriptive statistics can be used for descriptive inference: using data to learn something about the state of the world.
These descriptive statistics will also be the building blocks we use for causal inference: testing our hypotheses with data to learn something about how the world works.
We begin with descriptive statistics for a single variable.

Understanding Our Data
Before undertaking any data analysis, you should examine your data carefully.
Watch for unusual distributions of variable values and outliers in the data.
An outlier is an extreme value on a variable. Try to determine why you have observed this value. An unusual case? A coding error?

Example of an outlier affecting a relationship

Exploring Data: Frequency Distributions
Divide the variable into a set of exhaustive, mutually exclusive categories. Example:

Ideology       # of people   Percent   Cumulative Percent
Conservative   300           30%       30%
Moderate       500           50%       80%
Liberal        200           20%       100%
Total          1000          100%      100%
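As a rough sketch of how the percent and cumulative percent columns are built, the short Python example below recomputes them from the counts in the ideology example. Python is an assumption here (the class itself uses SPSS), and the variable names are made up for illustration.

```python
from itertools import accumulate

# Counts from the ideology example above, listed in table order.
counts = {"Conservative": 300, "Moderate": 500, "Liberal": 200}

total = sum(counts.values())

# Percent of all observations that fall in each category.
percents = {category: 100 * n / total for category, n in counts.items()}

# Cumulative percent: running sum of the percents, in table order.
cumulative = dict(zip(counts, accumulate(percents.values())))

for category in counts:
    print(f"{category:<14}{counts[category]:>6}"
          f"{percents[category]:>8.0f}%{cumulative[category]:>8.0f}%")
```

The cumulative column only makes sense when the categories have a meaningful order, as they do on an ideology scale.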

Exploring Data: Graphical Methods
For nominal and ordinal level data, bar graphs work well.

Exploring Data: Graphical Methods
For interval level data, a histogram is useful (note that it can reveal an outlier).

Central Tendency: Mode
The mode is the category of a variable with the greatest frequency of observations.
The mode is most commonly used on variables with a nominal level of measurement.
There can be more than one modal value for a variable. Variables with more than one mode are referred to as bimodal (two modes) or multimodal (several modes).
Example: In a party ID variable we have 40 Democrats, 60 Republicans, and 20 Independents, so the mode is Republican.
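A minimal Python sketch of finding the mode from category counts, using the party ID example. The helper name `modes` is hypothetical; it returns a list so that bimodal or multimodal variables are handled as well.

```python
from collections import Counter

# Counts from the party ID example above.
party_counts = Counter({"Democrat": 40, "Republican": 60, "Independent": 20})

def modes(counts):
    """Return every category that ties for the highest frequency."""
    top = max(counts.values())
    return [category for category, n in counts.items() if n == top]

print(modes(party_counts))
```

With these counts the list contains only Republican. If two categories tied for the highest count, both would be returned.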

Central Tendency: Median
The median is the value of a variable that divides the observations on that variable in half.
If we ordered our observations on a variable from lowest to highest, the median observation is the one in the middle.
With an even number of observations there is no single middle observation; by convention, the midpoint of the two middle observations is used.
The median is most commonly used on variables with an ordinal level of measurement, but is sometimes used on interval/ratio data because it is resistant to outliers.

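Example of Calculating Median
We have a 7-point scale on ideology in a survey:

Category:      1    2    3    4    5    6    7
# responses:   32   54   97   103  44   21   12

There are N = 363 responses, so the median observation is observation (N+1)/2 = 182.
Count up from the lowest value: the cumulative counts are 32, 86, and 183, so observation 182 falls in category 3. The median is 3.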
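The counting-up procedure can be sketched in Python as follows. The function name `median_category` is hypothetical, and the sketch assumes an odd number of observations, as in the example, so that there is a single middle observation.

```python
# Responses per category on the 7-point ideology scale from the example above.
responses = {1: 32, 2: 54, 3: 97, 4: 103, 5: 44, 6: 21, 7: 12}

def median_category(freq):
    """Find the category containing observation (N + 1) / 2.

    Assumes N is odd, so there is exactly one middle observation.
    """
    n = sum(freq.values())
    target = (n + 1) / 2
    running = 0
    # Count up from the lowest category until we pass the middle observation.
    for category in sorted(freq):
        running += freq[category]
        if running >= target:
            return category

print(sum(responses.values()), median_category(responses))
```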

Quartiles
If we arrange a variable from lowest to highest value, the median is the observation at the 50% mark.
Quartiles are at the 25%, 50% and 75% marks.
Quintiles: 20%, 40%, 60%, 80%
Deciles: every 10%
Percentiles: every 1%
We can use these to get a more detailed picture of the distribution of a variable.

Central Tendency: Mean
The mean is the sum of the values of a variable divided by the number of observations on that variable.
This is usually what people mean by "average."
The formula for the mean is written as: mean = ΣX / N, where ΣX is the sum of the values and N is the number of observations.
The mean is most commonly used on variables with an interval level of measurement.

Example of Calculating Mean
We have campaign spending in 7 districts:

District:   1      2      3      4      5   6     7
$ spent:    1000   5000   3500   2000   0   800   6000

ΣX = 1000 + 5000 + 3500 + 2000 + 0 + 800 + 6000 = 18300.
N = 7
The mean is 18300/7 = 2614.
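The same arithmetic as a short Python sketch (variable names are illustrative only):

```python
# Campaign spending in the 7 districts from the example above.
spending = [1000, 5000, 3500, 2000, 0, 800, 6000]

total = sum(spending)   # ΣX
n = len(spending)       # N
mean = total / n

print(total, n, round(mean))
```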

Central Tendencies in Global Income Distribution

Dispersion: Standard Deviation
The variance of a variable is the sum of the squared differences between each value of that variable and the mean, divided by N - 1.
We square the differences so that positive and negative differences don't cancel out.
We divide by N - 1 to get a (conservative) estimate of the mean dispersion of the variable.
The square root of the variance is the standard deviation: s = square root of [Σ(X - mean)^2 / (N - 1)]

Example of Calculating Standard Deviation
We have campaign spending in 7 districts:

District:   1      2      3      4      5   6     7
$ spent:    1000   5000   3500   2000   0   800   6000

Mean of variable is 2614.
s = square root of [1/6 × ((1000 - 2614)^2 + (5000 - 2614)^2 + ...)]
The standard deviation is 2275. (Dividing by N = 7 instead of N - 1 = 6 gives 2106.)
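A quick Python check of this calculation, offered as a sketch rather than the course's own method. It computes the standard deviation both ways, because the choice of denominator matters for this example: dividing by N - 1 = 6 gives about 2275, while dividing by N = 7 gives about 2106.

```python
import math
import statistics

# Campaign spending in the 7 districts from the example above.
spending = [1000, 5000, 3500, 2000, 0, 800, 6000]

n = len(spending)
mean = sum(spending) / n

# Squared differences between each value and the mean.
squared_diffs = [(x - mean) ** 2 for x in spending]

# Sample standard deviation: divide by N - 1, as in the definition above.
sample_sd = math.sqrt(sum(squared_diffs) / (n - 1))

# Population standard deviation: divide by N instead.
population_sd = math.sqrt(sum(squared_diffs) / n)

print(round(sample_sd), round(population_sd))

# The standard library's statistics.stdev also uses the N - 1 denominator.
print(round(statistics.stdev(spending)))
```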

z scores
A z score is a measure of how many standard deviations a particular observation is above or below the mean.
We subtract the mean from the observation and divide by the standard deviation: z = (X - mean) / s

Example of Calculating z scores
We have campaign spending in 7 districts:

District:   1      2      3      4      5   6     7
$ spent:    1000   5000   3500   2000   0   800   6000

Mean of variable is 2614.
Standard deviation of variable is 2275 (computed with N - 1 = 6 in the denominator; dividing by N = 7 gives 2106).
z score for district 1 is (1000 - 2614)/2275 = -0.71
z score for district 2 is (5000 - 2614)/2275 = 1.05
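A small Python sketch of the z score calculation for districts 1 and 2. The helper `z_score` is a hypothetical name. Since the result depends on which standard deviation is used, the sketch shows both: the N - 1 version (about 2275) gives roughly -0.71 and 1.05, while the N version (about 2106) gives roughly -0.77 and 1.13.

```python
import statistics

# Campaign spending in the 7 districts from the example above.
spending = [1000, 5000, 3500, 2000, 0, 800, 6000]

mean = statistics.mean(spending)
sd_sample = statistics.stdev(spending)       # N - 1 denominator
sd_population = statistics.pstdev(spending)  # N denominator

def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

for district, x in enumerate(spending[:2], start=1):
    print(district,
          round(z_score(x, mean, sd_sample), 2),
          round(z_score(x, mean, sd_population), 2))
```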

Descriptive Statistics for Relationships Between Variables
These are the more interesting descriptive statistics from our perspective, since we are interested in testing causal relationships between variables.
Our hypothesis tests later in the class will usually be based on these calculations.
As with a single variable, we begin by exploring our data to be sure we understand it.

Exploring Data: Bivariate Frequency Distributions
Divide the variables into a set of exhaustive, mutually exclusive categories. Example:

                       Favors Gas Tax   Opposes Gas Tax
Party ID Democrat      50% (N=500)      10% (N=100)
Party ID Republican    10% (N=100)      30% (N=300)
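As a sketch, the cell percentages in this table can be reproduced in Python from the raw counts. The percentages shown in the table are shares of all 1000 respondents. The row percentages computed at the end are an additional view, not part of the slide, that makes the two parties easier to compare.

```python
# Cell counts from the gas tax example above: (party, position) -> N.
counts = {
    ("Democrat", "Favors"): 500,
    ("Democrat", "Opposes"): 100,
    ("Republican", "Favors"): 100,
    ("Republican", "Opposes"): 300,
}

total = sum(counts.values())

# Percent of all respondents in each cell (what the table reports).
cell_percent = {cell: 100 * n / total for cell, n in counts.items()}

# Row totals, then percent within each party (an extra view for comparison).
row_totals = {}
for (party, _), n in counts.items():
    row_totals[party] = row_totals.get(party, 0) + n
row_percent = {
    (party, position): 100 * n / row_totals[party]
    for (party, position), n in counts.items()
}

for cell in counts:
    print(cell, f"{cell_percent[cell]:.0f}% of total,",
          f"{row_percent[cell]:.1f}% of party")
```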

Examples of Relationships in Crosstabs
Our hypothesis is that Democrats are more supportive of a gas tax. Do our data support this?

Example 1 (No):
       Dem   Rep
Yes    25%   25%
No     25%   25%

Example 2 (Yes):
       Dem   Rep
Yes    40%   10%
No     10%   40%

Example 3 (No):
       Dem   Rep
Yes    10%   40%
No     40%   10%

Exploring Data: Graphical Methods
For interval level data, scatterplots are a good way to examine relationships between variables.

Correlations
Correlations measure the relationship between two interval level variables.
Correlations always fall between -1 and 1.
Positive correlations indicate a positive relationship; negative correlations indicate a negative relationship.
No relationship gives a 0 correlation, but a 0 correlation does not necessarily mean no relationship.
Correlations only capture linear relationships: y = a + b*x
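To illustrate the last two points, here is a minimal Python sketch that computes the Pearson correlation by hand. The function name `pearson_r` and the data are made up for illustration. A perfectly linear relationship gives a correlation of 1, while a perfect but curved relationship (y = x squared on values symmetric around zero) gives a correlation of 0.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical data, symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 + 0.5 * x for x in xs]  # y = a + b*x with a = 2, b = 0.5
curved = [x ** 2 for x in xs]       # y depends entirely on x, but not linearly

print(pearson_r(xs, linear))
print(pearson_r(xs, curved))
```

In the curved case y is completely determined by x, yet the correlation is 0, which is why a scatterplot should be examined alongside the correlation.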

Positive Correlations Stronger Weaker

Negative Correlations Stronger Weaker

Examples of Correlations