Introduction to Statistics


Introduction to Statistics, Topics 1-5
Nellie Hedrick

Statistics
Statistics is the study of data; it is the science of reasoning from data. What is meant by the term "data"? You will find that data vary and that variability abounds in everyday life.

Observational units: the objects described by a set of data.
Variability: the phenomenon of a variable taking on different values or categories from one observational unit to another.
Quantitative variables: take numerical values for which numerical operations make sense, such as height, weight, and time.
Categorical variables: place an individual into one of several groups or categories, such as gender, cities in Oklahoma, or states in the USA.
Binary variables: categorical variables that can take only two possible values, such as male/female or yes/no.
Research question: often looks for patterns in a variable, compares a variable across different groups, or looks for a relationship between variables.

More on Observational Units and Variables
The distinction between categorical and quantitative variables is very important because it determines which statistical tools to use for analyzing a given data set.

Determine whether each variable is measured quantitatively or categorically:
How many hours you slept in the past 24 hours
Whether you slept for at least 7 hours in the past 24 hours

Watch for variables that take numerical values that are really just category labels, such as zip code.

Watch out: to determine whether something is actually a variable, ask yourself whether it represents a question that can be asked of each observational unit and whether the values can potentially vary from observational unit to observational unit.

Wrap up
Statistics is the science of data.
Data are not mere numbers; data are collected with a purpose and have meaning in some context.
The fundamental concept of statistics is variability.
As we go through the course, you will learn to classify variables and to determine which statistical tools to apply to the data.
Always consider data in context, and anticipate reasonable values for the data collected and analyzed.
A variable is a characteristic that varies from one observational unit (for example, one person) to another.
Identify variables as categorical, quantitative, or binary.

Activity 1-6 page 11
Activity 1-9 page 12
Activity 1-13 page 12

Topic 2: Data and Distributions and the Graphing Calculator
Picturing Distributions with Graphs
The distribution of a variable tells us what values it takes and how often it takes these values. We are looking for a pattern of variation.
Categorical variables place an individual into one of several groups or categories.
Quantitative variables take numerical values for which numerical operations make sense.
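As a small illustration of what "what values it takes and how often" means for a categorical variable, the sketch below tallies a made-up list of class standings. The data and the variable names are hypothetical, chosen only to show the idea.

```python
from collections import Counter

# Hypothetical data: class standing of 14 students (one value per observational unit).
class_standing = (
    ["Freshman"] * 2 + ["Sophomore"] * 6 + ["Junior"] * 1 + ["Senior"] * 5
)

counts = Counter(class_standing)  # how often each category occurs
total = sum(counts.values())

# Relative frequency (percent) of each category, rounded to one decimal place.
percents = {category: round(100 * n / total, 1) for category, n in counts.items()}

for category, pct in percents.items():
    print(f"{category:<10} {counts[category]:>2}  {pct:>5}%")
```

The table of counts and percents is the distribution of the variable; a bar chart is simply a picture of that table.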

Graphical Representations of Data: Categorical Variable

[Bar chart: Distribution of Percent of Students Attended Training by Class. Vertical axis: percent of students; horizontal axis: class (Freshman, Sophomore, Junior, Senior).]

Class        Frequency (%)
Freshman     14.3
Sophomore    42.9
Junior        7.1
Senior       35.7
Total       100

Activity 2-2: Hand Washing (page 17)
In August 2005, researchers for the American Society for Microbiology and the Soap and Detergent Association monitored the behavior of more than 6300 users of public restrooms. They observed people in public venues such as Turner Field in Atlanta and Grand Central Station in New York City. They found that 2393 of 3206 men washed their hands, compared to 2802 of 3130 women.
a. What proportion of the men washed their hands? What proportion of the women washed their hands?
b. Are these proportions consistent with the following pair of bar graphs?
c. Comment on what your calculations and the bar graph reveal about whether or not one gender is more likely to wash their hands after using a public restroom.
d. For each city, estimate the proportion of people who washed their hands as accurately as you can from the graph. Atlanta: Chicago: New York: San Francisco:
e. Comment on what the bar graphs reveal about how these cities compare with regard to hand washing.
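For part (a), each proportion is the number who washed their hands divided by the number observed in that group. A minimal sketch of the arithmetic, using the counts reported in the activity (the variable names are just for illustration):

```python
# Counts reported in Activity 2-2.
men_washed, men_total = 2393, 3206
women_washed, women_total = 2802, 3130

# A proportion is a count divided by the group size, so it lies between 0 and 1.
men_proportion = men_washed / men_total
women_proportion = women_washed / women_total

print(f"Men:   {men_proportion:.3f} ({men_proportion:.1%})")    # Men:   0.746 (74.6%)
print(f"Women: {women_proportion:.3f} ({women_proportion:.1%})")  # Women: 0.895 (89.5%)
```

The same value can be reported as a proportion (0.746) or as a percent (74.6%); which one to give depends on what the question asks for.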

Activity 2-2: Hand Washing (page 17), notes
We are studying whether people wash their hands after using a public restroom. We can look at:
The percentage of all people observed who washed their hands
The variation between men and women
The variation between people in different cities
The variation between men and women within each city

Activity 2-4: Buckle Up (page 19)
The National Highway Traffic Safety Administration (NHTSA) reports the percentage of residents in each state who regularly wear a seatbelt in a car and also whether the state has a primary or secondary type of seatbelt law. A primary law means that motorists can be stopped based solely on belt usage, while a secondary law means that the motorist can be stopped only for another reason. The 2005 data appear in the next table (s = secondary, p = primary, and * = not known):
a. What are the observational units for these data?
b. Classify each of the variables in the table as categorical (also binary) or quantitative.
c. What would you estimate is a typical usage percentage for a state with a primary-type seatbelt law? How about a state with a secondary-type law? (Do not perform any calculations; base your answers on a casual reading of the dotplots.) Primary: Secondary:
d. Does a state with a primary law always have a higher usage percentage than a state with a secondary law? Explain. If not, identify a pair of states for which the state with a primary law has a lower usage percentage than the state with a secondary law.
e. Do states with a primary law tend to have higher usage percentages than states with a secondary law? Explain how you can tell from the dotplots.
f. Do the data seem to support the contention that tougher (primary) laws lead to more seatbelt usage? Can you draw this conclusion definitively? Explain.

Activity 2-4: Buckle Up (page 19), notes
What type of variable is each one?
Create a visual display. A dotplot is a useful method for displaying small data sets of a quantitative variable.
Label the axis, especially if there is more than one group.
A bar graph or dotplot is usually more illuminating when we are comparing the distribution of a variable between two or more groups.
Statistical tendency: when comparing two or more groups or analyzing a data set, use words like "tend to," "on average," and "lead to" to express the results.

Watch out and In Brief
A bar graph or dotplot is usually more illuminating when we are comparing the distribution of a variable between two or more groups.
A statistical tendency describes what happens when comparing two or more groups or analyzing a data set, but it is not a hard-and-fast rule. This holds for both categorical and quantitative variables.
Be careful with your language. This is also true for cause-and-effect conclusions.
Label your graphs.
Be careful to note whether a question asks for a proportion (0 to 1) or a percent (0% to 100%).
Bar graphs are easier to compare than raw data.
Always relate your comments to the context of the data and, ideally, to the question of interest.

Watch out and Wrap up (continued)
Consistency refers to how variable, or spread out, the values of a quantitative variable are in a data set.
When describing a distribution, refer to both center (tendency) and spread (consistency).

Exercise 2-9 page 27
Exercise 2-12 page 28
Exercise 2-16 page 30

Topic 3: Drawing Conclusions from Studies
Data give you insight into interesting questions. A key idea is generalizing the results of a study to a larger group than the one you used in the study itself.

Population: the entire group of people or objects (observational units) of interest in a study.
Sample: a typically small part of the population from which data are gathered in order to learn about the population. If the sample is selected carefully (so that it is representative of the population), you can learn a lot about the population.
Sample size: the number of observational units (people or objects) studied in a sample.
Sampling bias: a sampling procedure is biased if it tends systematically to overrepresent certain segments of the population and underrepresent others.

More Definitions: Activity 3-1 page 35
Convenience sample: a sample selected because its members are conveniently available.
Voluntary response sample: a sample selected in such a way that members of the population decide for themselves whether or not to be part of the study.
Nonresponse: a problem that can arise when observational units selected for the sample do not respond to the study.
Sampling frame: the list used to select the subjects. Bias can arise when this list does not represent all of the variation in the population.
Parameter: a number that describes the population (P-P).
Statistic: a number that describes the sample (S-S).
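To make the difference between a parameter and a statistic concrete, and to show how an unrepresentative sampling frame produces bias, here is a minimal sketch. The population, the 10,000 and 3,000 counts, and the support percentages are all invented for illustration; they do not come from any activity in these notes.

```python
import random

# Hypothetical population of 10,000 adults, stored as (in_frame, supports) pairs.
# Only 3,000 appear on the list used to draw the sample (the sampling frame).
population = (
    [(True, True)] * 1800 + [(True, False)] * 1200      # on the list: 60% support
    + [(False, True)] * 2200 + [(False, False)] * 4800  # not on the list
)

# Parameter: the proportion of the whole population that supports the measure.
parameter = sum(supports for _, supports in population) / len(population)

# The frame covers only part of the population.
frame = [person for person in population if person[0]]
frame_proportion = sum(supports for _, supports in frame) / len(frame)

# Statistic: the proportion in a sample of 500 drawn at random from the frame.
rng = random.Random(1)  # fixed seed so the sketch is repeatable
sample = rng.sample(frame, 500)
statistic = sum(supports for _, supports in sample) / len(sample)

print(f"parameter (population): {parameter:.2f}")
print(f"proportion in frame:    {frame_proportion:.2f}")
print(f"statistic (sample):     {statistic:.2f}")
```

In this made-up setting the statistic lands near the frame's 0.60 rather than the population's 0.40. Selecting at random from the list does not help, because the list itself overrepresents one segment of the population.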

Activity 3-1 page 34
Elvis Presley is reported to have died in his Graceland mansion on August 16, 1977. On the 12th anniversary of this event, a Dallas record company wanted to learn the opinions of all adult Americans on the issue of whether Elvis was really dead. But of course they could not ask every adult American this question, so they sponsored a national call-in survey. Listeners of more than 100 radio stations were asked to call a 1-900 number (at a charge of $2.50) to voice an opinion concerning whether Elvis was really dead. It turned out that 56% of the callers thought that Elvis was alive.
This scenario is very common in statistics: wanting to learn about a large group based on data from a smaller group.

Activity 3-1 page 34 (cont.)
In 1936, Literary Digest magazine conducted the most extensive (to that date) public opinion poll in history. They mailed out questionnaires to over 10 million people whose names and addresses they had obtained from telephone books and vehicle registration lists. More than 2.4 million people responded, with 57% indicating they would vote for Republican Alf Landon in the upcoming presidential election. (Incumbent Democrat Franklin Roosevelt won the actual election, carrying 63% of the popular vote.)

More Definitions: Activity 3-4 page 39
Explanatory variable: the variable whose effect you want to study.
Response variable: the variable that you suspect is affected by the explanatory variable.
Observational study: a study in which researchers passively observe and record information about the observational units.
Lurking variable: a variable that is not recorded in an observational study, so its possible effects are not accounted for.
Confounding variable: a lurking variable whose effects on the response variable are indistinguishable from the effects of the explanatory variable.

Activity 3-4 page 39
Exercise 3-8 page 46

Wrap Up: Key questions to consider
What two things can prevent you from drawing certain conclusions from a study? Bias and confounding.
To what population can you reasonably generalize the results of a study? It depends on how the sample was selected.
Can you reasonably draw a cause-and-effect connection between the explanatory and response variables? It depends on whether the explanatory variable was assigned to the observational units.

Topic 4: Random Sampling
One way to avoid a biased sampling method is to give every member of the population the same chance of being selected for the sample. Your selection method should ensure that every possible sample (of the desired sample size) has an equal chance of being the sample ultimately selected. Such a method is called simple random sampling (SRS).
Unbiased: a statistic is said to provide unbiased estimates of a population parameter if the values of the statistic from different random samples are centered at the actual parameter value.
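A simple random sample can be drawn by numbering the observational units and letting software pick the numbers. The sketch below uses Python's standard `random.sample`, which selects without replacement so that every subset of the requested size is equally likely. The population of 500 ID numbers, the sample size of 10, and the seed are assumptions made up for the example.

```python
import random

# Hypothetical population: ID numbers 1 through 500, one per observational unit.
population = list(range(1, 501))
sample_size = 10

rng = random.Random(2024)  # fixed seed (an arbitrary choice) so the result is repeatable

# random.sample draws without replacement; every subset of size 10 is equally likely.
srs = rng.sample(population, sample_size)

print(sorted(srs))
```

Removing the fixed seed gives a different sample on each run, which is exactly the sample-to-sample variation that statistics from random samples exhibit.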

Definitions
Sampling variability: an important statistical property, which refers to the fact that the values of sample statistics vary from sample to sample.
Precision: the precision of a sample statistic refers to how much its values vary from sample to sample.
The larger the sample size, the more precise the statistic: values from larger samples are closer together than those from smaller samples, so with random sampling the statistic provides a more accurate estimate of the corresponding population parameter.

Activity 4-1 page 54
Activity 4-2 page 57
Exercise 4-18 page 73
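Sampling variability and precision can be seen in a short simulation: draw many simple random samples of two different sizes and compare how much the sample means spread out. The population below (20,000 values centered near 50), the sample sizes, and the number of repetitions are hypothetical choices for illustration only.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20,000 values of a quantitative variable.
population = [rng.gauss(50, 10) for _ in range(20_000)]
parameter = statistics.mean(population)

def sample_means(sample_size, repetitions=1000):
    """Mean of each of many simple random samples of the given size."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repetitions)
    ]

means_small = sample_means(10)
means_large = sample_means(100)

# The spread of the sample means measures sampling variability (lower = more precise).
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

print(f"population mean: {parameter:.2f}")
print(f"n = 10:  center {statistics.mean(means_small):.2f}, spread {spread_small:.2f}")
print(f"n = 100: center {statistics.mean(means_large):.2f}, spread {spread_large:.2f}")
```

Both sets of sample means are centered near the population mean, which is what unbiased means, while the means from samples of 100 are packed more tightly than those from samples of 10, which is what greater precision means.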

Wrap up
Do not confuse the sample size with the number of samples taken in a study.
Although the sample size is crucial in assessing how a sample statistic varies from sample to sample, the size of the population does not affect the sampling variability. As long as the population is large relative to the sample size (at least 10 times as large), the precision of a sample statistic depends on the sample size and not on the population size.

Topic 5: Designing Experiments
SELF STUDY
QUIZ
Assignment 1
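The claim that precision depends on the sample size rather than the population size can be checked with a rough simulation. In this sketch the two population sizes (2,000 and 200,000), the sample size of 100, and the distribution of values are all assumptions chosen for illustration; both populations are at least 10 times as large as the sample.

```python
import random
import statistics

rng = random.Random(11)  # fixed seed so the sketch is repeatable

def spread_of_sample_means(population_size, sample_size=100, repetitions=500):
    """Standard deviation of sample means across repeated simple random samples."""
    # Hypothetical population drawn from the same distribution at each size.
    population = [rng.gauss(50, 10) for _ in range(population_size)]
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(repetitions)
    ]
    return statistics.stdev(means)

# Both populations are at least 10 times as large as the sample of 100.
spread_small_pop = spread_of_sample_means(2_000)
spread_large_pop = spread_of_sample_means(200_000)

print(f"population of   2,000: spread {spread_small_pop:.2f}")
print(f"population of 200,000: spread {spread_large_pop:.2f}")
```

The two spreads should come out close to each other even though one population is 100 times larger, since the sample size is the same in both cases.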