Summarizing Data. (Ch 1.1, 1.3, , 2.4.3, 2.5)

Similar documents
Introduction to Statistical Data Analysis I

Department of Statistics TEXAS A&M UNIVERSITY STAT 211. Instructor: Keith Hatfield

Data, frequencies, and distributions. Martin Bland. Types of data. Types of data. Clinical Biostatistics

V. Gathering and Exploring Data

Table of Contents. Plots. Essential Statistics for Nursing Research 1/12/2017

Business Statistics Probability

Outline. Practice. Confounding Variables. Discuss. Observational Studies vs Experiments. Observational Studies vs Experiments

The normal curve and standardisation. Percentiles, z-scores

Business Statistics (ECOE 1302) Spring Semester 2011 Chapter 3 - Numerical Descriptive Measures Solutions

Basic Statistics 01. Describing Data. Special Program: Pre-training 1

Chapter 1: Exploring Data

Still important ideas

Population. Sample. AP Statistics Notes for Chapter 1 Section 1.0 Making Sense of Data. Statistics: Data Analysis:

Frequency distributions

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Methodological skills

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Unit 1 Exploring and Understanding Data

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Chapter 1. Picturing Distributions with Graphs

MBA 605 Business Analytics Don Conant, PhD. GETTING TO THE STANDARD NORMAL DISTRIBUTION

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Observational studies; descriptive statistics

Still important ideas

Averages and Variation

2.4.1 STA-O Assessment 2

AP STATISTICS 2010 SCORING GUIDELINES (Form B)

Quantitative Data and Measurement. POLI 205 Doing Research in Politics. Fall 2015

Example The median earnings of the 28 male students is the average of the 14th and 15th, or 3+3

Biostatistics. Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego

Probability and Statistics. Chapter 1

1) What is the independent variable? What is our Dependent Variable?

Announcement. Homework #2 due next Friday at 5pm. Midterm is in 2 weeks. It will cover everything through the end of next week (week 5).

CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA

STATISTICS AND RESEARCH DESIGN

Stem-and-Leaf Displays. Example: Binge Drinking. Stem-and-Leaf Displays 1/29/2016. Section 3.2: Displaying Numerical Data: Stem-and-Leaf Displays

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Statistics is a broad mathematical discipline dealing with

Understandable Statistics

STT 200 Test 1 Green Give your answer in the scantron provided. Each question is worth 2 points.

Undertaking statistical analysis of

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Medical Statistics 1. Basic Concepts Farhad Pishgar. Defining the data. Alive after 6 months?

Lesson 9 Presentation and Display of Quantitative Data

C-1: Variables which are measured on a continuous scale are described in terms of three key characteristics central tendency, variability, and shape.

Introduction & Basics

Welcome to OSA Training Statistics Part II

Human-Computer Interaction IS4300. I6 Swing Layout Managers due now

Math 214 REVIEW SHEET EXAM #1 Exam: Wednesday March, 2007

NORTH SOUTH UNIVERSITY TUTORIAL 1

Stats 95. Statistical analysis without compelling presentation is annoying at best and catastrophic at worst. From raw numbers to meaningful pictures

DO NOT OPEN THIS BOOKLET UNTIL YOU ARE TOLD TO DO SO

AP Statistics. Semester One Review Part 1 Chapters 1-5

How to interpret scientific & statistical graphs

Analysis and Interpretation of Data Part 1

PRINTABLE VERSION. Quiz 1. True or False: The amount of rainfall in your state last month is an example of continuous data.

MTH 225: Introductory Statistics

Statistics. Nur Hidayanto PSP English Education Dept. SStatistics/Nur Hidayanto PSP/PBI

Readings: Textbook readings: OpenStax - Chapters 1 4 Online readings: Appendix D, E & F Online readings: Plous - Chapters 1, 5, 6, 13

This is Descriptive Statistics, chapter 12 from the book Psychology Research Methods: Core Skills and Concepts (index.html) (v. 1.0).

9 research designs likely for PSYC 2100

12.1 Inference for Linear Regression. Introduction

What you should know before you collect data. BAE 815 (Fall 2017) Dr. Zifei Liu

Descriptive Statistics Lecture

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

AP Statistics TOPIC A - Unit 2 MULTIPLE CHOICE

Statistics Guide. Prepared by: Amanda J. Rockinson- Szapkiw, Ed.D.

Statistics: Making Sense of the Numbers

q2_2 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Chapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.

Samples, Sample Size And Sample Error. Research Methodology. How Big Is Big? Estimating Sample Size. Variables. Variables 2/25/2018

3: Summary Statistics

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences.

Statistics: Interpreting Data and Making Predictions. Interpreting Data 1/50

Section I: Multiple Choice Select the best answer for each question.

Normal Distribution. Many variables are nearly normal, but none are exactly normal Not perfect, but still useful for a variety of problems.

Identify two variables. Classify them as explanatory or response and quantitative or explanatory.

Lesson 1: Distributions and Their Shapes

10/4/2007 MATH 171 Name: Dr. Lunsford Test Points Possible

Before we get started:

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Displaying the Order in a Group of Numbers Using Tables and Graphs

Types of Statistics. Censored data. Files for today (June 27) Lecture and Homework INTRODUCTION TO BIOSTATISTICS. Today s Outline

Chapter 1: Explaining Behavior

Previously, when making inferences about the population mean,, we were assuming the following simple conditions:

DOWNLOAD PDF TAKE FULL ADVANTAGE OF DESCRIPTIVE STATISTICS

CHAPTER 2. MEASURING AND DESCRIBING VARIABLES

STT315 Chapter 2: Methods for Describing Sets of Data - Part 2

Measurement and Descriptive Statistics. Katie Rommel-Esham Education 604

1. Introduction a. Meaning and Role of Statistics b. Descriptive and inferential Statistics c. Variable and Measurement Scales

Quizzes (and relevant lab exercises): 20% Midterm exams (2): 25% each Final exam: 30%

AP Psych - Stat 1 Name Period Date. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 60

STP226 Brief Class Notes Instructor: Ela Jackiewicz

AP Psych - Stat 2 Name Period Date. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you?

Lesson 1: Distributions and Their Shapes

PRINCIPLES OF STATISTICS

Instructions and Checklist

Transcription:

1 Summarizing Data (Ch 1.1, 1.3, 1.10-1.13, 2.4.3, 2.5)

Populations and Samples An investigation of some characteristic of a population of interest. Example: You want to study the average GPA of juniors who are engineering majors. Population: All engineering majors who are juniors. Characteristic of interest: Average GPSA. 2

Populations and Samples What statisticians need to do: 1) Learn about the distribution of the characteristic in the population. 2) Do this by taking a sample from the population. Why? 3) Sample statistics. 3

Graphics: Histograms A histogram is a graphical representation of the distribution of numerical data. Construct a histogram: 1. Bin the range of values. (The bins are usually consecutive, non-overlapping, and are usually equal size.) 2. Frequency histogram: count how many values fall into each bin/interval and draw accordingly. 3. Density histogram: count how many values fall into each bin, and adjust the height such that the sum of the area of each bin equals 1. 4

Graphics: Histograms Examples: - Drawing a frequency histogram by hand. - Drawing a density histogram by hand. 5

Example Charity is a big business in the United States. The Web site charitynavigator.com gives information on roughly 5500 charitable organizations. Some charities operate very efficiently, with fundraising and administrative expenses that are only a small percentage of total expenses, whereas others spend a high percentage of what they take in on such activities. 6

Example cont d Here are the data on fundraising expenses as a percentage of total expenditures for a random sample of 60 charities: 6.1 12.6 34.7 1.6 18.8 2.2 3.0 2.2 5.6 3.8 2.2 3.1 1.3 1.1 14.1 4.0 21.0 6.1 1.3 20.4 7.5 3.9 10.1 8.1 19.5 5.2 12.0 15.8 10.4 5.2 6.4 10.8 83.1 3.6 6.2 6.3 16.3 12.7 1.3 0.8 8.8 5.1 3.7 26.3 6.0 48.0 8.2 11.7 7.2 3.9 15.3 16.6 8.8 12.0 4.7 14.7 6.4 17.0 2.5 16.2 7

Example cont d We can see that a substantial majority of the charities in the sample spend less than 20% on fundraising: 8

Graphics: Histograms Histograms come in a variety of shapes. Unimodal histogram: single peak Bimodal histogram: two different peaks Multimodal histogram: many different peaks Bimodality: Can occur when the data set consists of observations on two quite different kinds of individuals or objects. Multimodality Symmetric histograms Positively skewed histograms Negatively skewed histograms 9

Sample Statistics Histograms and other visual summaries of samples are excellent tools for informal learning about population characteristics. The calculation and interpretation of certain summarizing numbers are required for a deeper understanding of the data. These sample numerical summaries are called Sample Statistics 10

Sample Statistics: Measures of Centrality Summarizing the center of the sample data is a popular and important characteristic of a set of numbers. 3 popular types of center: 1. Mean 2. Median 3. Mode 11

The Sample Mean For a given set of numbers x 1, x 2,..., x n, the most familiar measure of the center is the mean (arithmetic average). Sample mean x of observations x 1, x 2,..., x n : 12

The Sample Mean For a given set of numbers x 1, x 2,..., x n, the most familiar measure of the center is the mean (arithmetic average). Sample mean x of observations x 1, x 2,..., x n : Disadvantage? 13

The Sample Median Median: Middle value when observations are ordered smallest to largest. 14

The Sample Median Median: Middle value when observations are ordered smallest to largest. To calculate: Order the n observations smallest to largest (repeated values included and find the middle one. 15

The Mean vs. the Median The population mean µ and median will not generally be identical. If the population distribution is positively or negatively skewed, as pictured below, then (a) Negative skew (b) Symmetric (c) Positive skew Three different shapes for a population distribution Which population characteristic is most important? 16

The Mean vs. the Median The population mean µ and median will not generally be identical. If the population distribution is positively or negatively skewed, as pictured below, then (a) Negative skew (b) Symmetric (c) Positive skew Three different shapes for a population distribution 17

Other Sample Measures Quartiles: divide the data set into four equal parts (how is this calculated?) Percentiles: A data set can be even more finely divided. What does percentile mean? Example calculations of the median and quartiles. 18

Graphics: Boxplots A boxplot is a convenient way of graphically depicting groups of numerical data through the five number summary: minimum, first quartile, median, third quartile, and maximum. Example: Drawing a boxplot by hand. 19

Variability So far, we ve learned techniques for visualizing our data and measures of center. What about how far apart the data is spread out? Samples with identical measures of center but different amounts of variability 20

Variability Simplest measure of variability: The range. Samples with identical measures of center but different amounts of variability 21

Variability Simplest measure of variability: The range. Samples with identical measures of center but different amounts of variability What are the disadvantages of the range? 22

Variability Can we combine the deviations into a single quantity by finding the average deviation? A more robust measure of variation takes into account deviations from the mean 23

Variability The sample variance, denoted by s 2, is given by The sample standard deviation, denoted by s, is the (positive) square root of the variance: Note that s 2 and s are both nonnegative. The unit for s is the same as the unit for each of the x i. Example: Calculation of the SD. 24

Summarizing Data in R - Summary statistics - Graphics (boxplots, histograms) 25