Chapter 1 Where Do Data Come From?


Understanding Data

The purpose of this class: to be able to read the newspaper and know what the heck they're talking about! To be able to go to the casino and know why the house always wins.

Statistics: the study of how to collect, organize, analyze, and interpret information.
The advantage of statistics is that it gives a process for making decisions, without prejudice, when faced with uncertainty.
Statistics are used in many fields. Examples:
Medical: what are the chances a patient will go into remission with a certain cancer treatment?
Education: does writing material down increase the ability to remember facts?

Population: the collection of individuals or items of interest.
Examples: all residents of Kentucky; all admits to hospitals in the U.S.
Maybe you want to know what Wayne State students like to do on a Friday night. Your population: Wayne State students.
The population is defined in terms of our desire for knowledge.

Census: measurements from the entire population are used.
Every 10 years the U.S. conducts a census.
o They attempt to reach every resident in the United States
o Some people are difficult to reach
Often, it is not feasible to study the entire population. Instead, we take measurements from a subset.

Sample: the subset of the population on which we make measurements.
We call the measurements data.
We don't ask everyone in the population. We ask part of the population: a sample.

Where do data come from?

Individuals: the objects described by a set of data (can be people, animals, or things).
Variable: any characteristic of an individual that can take different values for different individuals (we collect data on the variables that we are interested in).

We conduct a study to collect and process data. Types of studies:

Observational study: observes individuals and measures variables of interest but does not attempt to influence responses.
Sample survey: a type of observational study in which a sample is selected and asked to respond to questions.
Examples: public opinion polls, pre-election polls, teacher evaluations.

Experiment: deliberately imposes some treatment on individuals in order to observe their responses (the purpose is to study whether the treatment causes a change in the response).
Examples:
Medical study: patients are given drugs at various dosage levels to study effectiveness.
Change a product's container and see if individuals notice the difference.

There are two main parts in the science of statistics:
1) Descriptive statistics: methods of summarizing a set of data
2) Inferential statistics: methods of making inference about a population based on the information in a sample

Chapter 2 Samples, Good and Bad

Bias: a prejudice in one direction.
Going to the Democratic National Convention and asking each person who they voted for would create bias.

Common ways of creating bias:

Convenience sampling: uses results or data that are conveniently and readily obtained (runs the risk of being severely biased!).
Example: asking your friends.

Voluntary response samples: these often overrepresent people with strong opinions.
Example: restaurant comment cards. People who are willing to volunteer an opinion usually have a strong opinion, and it's usually not good. Not many people take time to say "I'm happy!"

This type of data is biased and should not be generalized to the overall population.

For a sample to be useful, it needs to represent the population! This is important since we usually want to extend the results to the population.
The sample should be similar to the population in terms of demographics and other variables.
One way to do this is with a random sample: a sample determined completely by chance.

A simple random sample (SRS) of n measurements from a population is one selected in such a manner that every sample of size n from the population has an equal probability of being selected.

With a random sample:
Our sample will typically be similar to our population with respect to demographic characteristics.
We can control the probability of making a mistake (the probability of error).
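As a rough illustration of the SRS definition above, the sketch below draws a sample so that every subset of size n is equally likely. The roster of 500 students, the function name `simple_random_sample`, and the seed are assumptions made up for this example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)

# Hypothetical population: a roster of 500 students.
students = [f"student_{i}" for i in range(1, 501)]
sample = simple_random_sample(students, 25, seed=1)
```

Because `random.sample` draws without replacement, no student can appear in the sample twice.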

Chapter 3 What Do Samples Tell Us?

Statistic: a numerical characteristic of a sample.
This value is known when we take our sample, but it will vary from sample to sample.
Example: Of 100 people surveyed, 37 said they would rather take a train than drive. Our statistic is 37% of the people surveyed (in this case, our sample).

Parameter: a numerical characteristic of a population, denoted p.
This value is a fixed number, but when doing inferential statistics we will not know its value (unless we take a census).
Since the population is often not available, we use statistics to estimate parameters.

Variability: describes the spread of the values of the statistic.
We can control this variability, since a larger sample gives less variation.
To reduce bias we should use random sampling.
From the sampling variability we can calculate the margin of error.

The margin of error gives us a way of estimating the parameter, given a statistic.
Suppose the margin of error is ± 2% and the statistic is that 58% of the people surveyed hold some belief. Because 95% of samples give a statistic within 2% of the parameter, we can say with 95% confidence that between 56% and 60% of the population hold that belief (plus or minus 2% from our statistic).

We can say with 95% confidence that the amount by which a proportion obtained from a sample will differ from the population proportion will not exceed $1/\sqrt{n}$, where n is the number of people in the sample.

Example: a sample proportion of 50% and a sample of 1600 people.
The margin of error? $1/\sqrt{1600} = 1/40 = 2.5\%$
The confidence statement? We are 95% confident that the population parameter is between 47.5% and 52.5%.

How large a sample is large enough? What are the factors?
How confident do we want to be in our conclusions?
How much variability is in our data?
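The quick margin-of-error rule above is easy to check numerically. This is a minimal sketch; the function names are invented for illustration, and the only formula used is the $1/\sqrt{n}$ rule from the notes.

```python
import math

def quick_margin_of_error(n):
    """Approximate 95% margin of error for a sample proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

def confidence_statement(p_hat, n):
    """Return the (low, high) ends of the 95% confidence statement."""
    moe = quick_margin_of_error(n)
    return p_hat - moe, p_hat + moe

moe = quick_margin_of_error(1600)            # 0.025, i.e. 2.5%
low, high = confidence_statement(0.50, 1600)  # 0.475 and 0.525
```

Note that quadrupling the sample size only halves the margin of error, which is one reason very precise polls are expensive.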

Chapter 5 Experiments, Good and Bad

"Back off, man. I'm a scientist."

To conduct a study properly, you must do the following:
Get a representative sample.
Get a large enough sample.
Decide whether the study should be an observational study or an experiment.

A response variable (dependent variable) is a variable that measures an outcome or result of a study.
An explanatory variable (independent variable) is a variable that attempts to explain, or causes changes in, the response variable.

This just in: According to statisticians, there is a link between having a hangover and doing poorly on an exam. While the experiment is not conclusive, statisticians recommend not trying to take an exam after a night of drinking.
Response variable: exam scores. Explanatory variable: whether or not you have a hangover.

In an experiment, we create differences in the explanatory variable and then examine the results.

In an observational study, we observe differences in the explanatory variable and then notice whether these are related to differences in the response variable.

Observational study: We observe that televisions with a larger screen size weigh more.
Experiment: We noticed that when patients were given a placebo, they rated their mood higher than if they did not take the placebo. Not to be confused with the placebo effect (a lurking variable for most drug-related experiments).

The individuals studied in an experiment are often called subjects.
A treatment is one or a combination of explanatory variables assigned by the experimenter.
A treatment diagram is useful in determining whether all combinations of the explanatory variables have been used.
Suppose we are conducting an experiment on weight loss and the explanatory variables we want to use are diet (fat-free and Atkins) and exercise (swimming and walking).

                 Diet
Exercise     Fat-free   Atkins
Swimming        1         2
Walking         3         4

When conducting an experiment it is important to randomly assign individuals to one of the treatment groups (with two groups, random assignment is equivalent to flipping a coin to decide group membership).
A placebo is a dummy treatment, given to subjects, that looks similar to the treatment being given in the experiment.
Control groups are used to help control lurking variables (variables that have an effect on the response variable but are not part of the study).
A confounding variable is a variable whose effect on the response variable cannot be separated from the effect of an explanatory variable. Confounding variables are examples of lurking variables.
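The weight-loss example above has four treatments (two diets crossed with two exercises). One way random assignment might be carried out is sketched below; the 20 subject labels, the seed, and the helper name `randomly_assign` are assumptions for illustration only.

```python
import random
from itertools import product

exercises = ["swimming", "walking"]
diets = ["fat-free", "Atkins"]
# Every combination of the explanatory variables: the four cells of the diagram.
treatments = list(product(exercises, diets))

def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

subjects = [f"subject_{i}" for i in range(1, 21)]  # hypothetical subjects
groups = randomly_assign(subjects, treatments, seed=7)
```

Shuffling first and dealing second keeps the group sizes equal while still leaving group membership to chance.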

An interaction occurs when the effect of one explanatory variable on the response variable depends on what is happening with another explanatory variable.
An observed effect so large that it would rarely occur by chance is called statistically significant.

Chapter 6 Experiments in the Real World

Nonadherers: subjects who participate but do not follow the experimental treatments.
In a study that tries to determine if a cough medicine helps the patient with pain, a nonadherer may be the person who keeps making themselves hot toddies for their cold while continuing to be a subject of the experiment. The experimenter will not be able to tell whether the medicine or the hot toddy was relieving the pain.

Dropouts: subjects who begin the experiment but do not complete it.

Blinding:
(single blind) only the administrator knows if the subjects receive the treatment or placebo
(double blind) neither the subjects nor the administrator know what is being given

Completely randomized experimental design: all the subjects are allocated at random among all treatment groups. Did someone say SRS?

A block is a group of subjects that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.
Example: Some of the subjects are pregnant.

In a block design, the random assignment of subjects to treatments is carried out separately within each block.
Example: Do an SRS for the pregnant women and an SRS for the rest of the subjects.
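A block design like the one described above could be sketched as follows: split the subjects into blocks, then randomize separately inside each block. The subject records, the `pregnant` flag, and the function name are hypothetical and exist only to make the example runnable.

```python
import random

def block_randomize(subjects, block_of, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # randomization happens inside the block only
        for i, subject in enumerate(shuffled):
            assignment[subject["id"]] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: 4 pregnant, 8 not.
subjects = [{"id": i, "pregnant": i < 4} for i in range(12)]
assignment = block_randomize(
    subjects, lambda s: s["pregnant"], ["treatment", "placebo"], seed=3
)
```

With this layout each block contributes equally to both groups, so pregnancy cannot be confounded with the treatment.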

Chapter 8 Measuring

Measure: assign a number to represent a property.
Instrument: something used to measure.
Units: the type of values our measurements take.

Validity: a measurement is valid if it is relevant or appropriate when representing a property.
Story: Goober loved to measure the weight of different chocolate cakes, but Goober is a few candles short of a birthday. Goober didn't have a problem with his instrument; he used a calibrated scale he bought at his favorite store, WEIGH STUFF MART. His units were not bad either; Goober weighed his cakes in ounces. Goober's problem was that he weighed the cakes by holding them and stepping on the scale. Goober thinks the average cake weighs approximately 2112 ounces. You could say his measurements are not entirely valid.

Often a rate (percent) at which something occurs is a more valid measure than a frequency.
NYC resident (population 8.3 million): "We have 328 chocolate shops in my city!"
Ionia resident (population 11.4 thousand): "WOW! I wish we had that many; we only have 38 chocolate shops!"
Rates may be a better option for comparing: per 10,000 residents, NYC has about 0.4 chocolate shops and Ionia about 33.

Reliability: a measurement is reliable if it is the same time after time when taken on the same individual.
Reliability is about variability: consistency across measures.

It might be best, when trying to get Garfield's weight, to weigh him more than once and see that the measurement is consistent. If Garfield weighs 122.1, 122.1, 122.1, 122.2, 122.1 after stepping on the scale five times, we see that the variance is small and that this is a reliable measurement.
Variance: a value used to determine if random error is small (so that our measurement is reliable).

Types of data

Qualitative (categorical) variable: places responses into categories with no logical ordering.
Quantitative variable: numeric values that can be ordered and on which mathematical operations, such as finding an average, can be performed.
Discrete variable: things that can be counted.
Continuous variable: things that are measured.
Run a race in 2:03? Time is continuous; nobody counted your seconds.
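Returning to the Garfield weighings earlier on this page, the sketch below computes the sample variance of the five readings listed in the notes. The use of Python's `statistics` module is a choice made for this example; the notes defer the formula to class.

```python
import statistics

# The five scale readings listed for Garfield.
readings = [122.1, 122.1, 122.1, 122.2, 122.1]

mean_weight = statistics.mean(readings)
variance = statistics.variance(readings)  # sample variance (divides by n - 1)
spread = max(readings) - min(readings)
```

A variance this close to zero (about 0.002) is what "reliable" looks like numerically: repeated measurements barely move.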

Chapter 10 Graphs, Good and Bad

The distribution of a variable tells what values it takes and how often it takes these values.

Pie chart: displays the division of a total quantity.
Used only for qualitative data.
Should not include too many categories.
The number of degrees for each wedge should be proportional to the percentage.
The percentages must add up to 100%.
Two categories: pie eaten and pie not yet eaten. Notice the percentage of pie eaten is approximately 33%, and the associated angle is 0.33(360), or approximately 120 degrees.

Bar graph: displays the frequency or percentage of items in each category.
Can be used for more than one categorical variable.
The bars can be vertical or horizontal.
The bars should be of uniform width and uniformly spaced.
The length of a bar represents the quantity we wish to compare.
I found this bar graph online and thought the title was amusing. I imagined a study where someone asked children "What is your favorite juice?" and most students replied "Yellow!" I'm stumped: what is yellow juice?

Line graph (time plot): shows the relationship between a quantitative variable and time.
Time is the horizontal scale (x-axis).
The quantitative variable being measured is the vertical scale (y-axis).
This next example seems contradictory to my beliefs, but it was also kind of amusing.

A time series is a record of a variable over time.
A steady change over time is called a trend.
A seasonal component in a time series means that the variable tends to be higher at certain points in time and lower at certain points in time.
All other variation can be explained by irregular cycles and random fluctuations.

Chapter 11 Displaying Distributions with Graphs

Extreme value or outlier: observations that are separated from the rest of the data set by some margin.
Imagine a study where you asked people at an M&M conference how many M&Ms they ate each day and the results were:
32, 33, 45, 67, 28, 32, 40, 0, 32, 41, 879, 33
We see that 0 and 879 are outliers. Who goes to an M&M conference without eating M&Ms? I'd be a little nervous about that person. I'd also be a little nervous about the person who consumes 879 M&Ms!

Shape: the pattern displayed when the graph is created.

Stem and leaf: separates data entries into leading digits (stems) and trailing digits (leaves).
Features:
A device that organizes and groups data but allows us to recover the original data if desired.
Good for spotting extreme values and identifying shape.
Example: 14 male weights in pounds
139, 153, 179, 201, 163, 168, 157, 170, 172, 165, 145, 155, 161, 151
stem: tens of pounds; leaf: ones of pounds
A stem and leaf plot for inches of snow per day for the first week of May?

Frequency distribution: a summary table in which the data are arranged into conveniently established class groupings.
Should have between 5 and 15 classes.
Each class grouping should be of equal width.
Overlapping classes must be avoided.
Useful when dealing with very large data sets.
Through the grouping process the original data are lost.
Class midpoint: the point halfway between the boundaries of each class.

Weight (pounds)           Frequency
130 but less than 140         1
140 but less than 150         1
150 but less than 160         4
160 but less than 170         4
170 but less than 180         3
180 but less than 190         0
190 but less than 200         0
200 but less than 210         1
Total                        14

Histogram: a picture of a frequency distribution.

Shapes of histograms
Symmetrical: both sides are the same when the graph is folded vertically.
Uniform: every class has equal frequency.
Skewed left or skewed right: one tail is stretched longer than the other. The direction of the skewness is on the side of the longer tail.
Bimodal: the two classes with the largest frequencies are separated by at least one class.
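The stem-and-leaf display and the frequency table for the 14 male weights can be rebuilt from the raw data. In this sketch the class width of 10 pounds starting at 130 is an assumption inferred from the first class label and the eight-row table; the function names are invented for illustration.

```python
from collections import Counter, defaultdict

weights = [139, 153, 179, 201, 163, 168, 157, 170, 172, 165, 145, 155, 161, 151]

def stem_and_leaf(data):
    """Stems are tens of pounds, leaves are the ones digit."""
    plot = defaultdict(list)
    for value in sorted(data):
        plot[value // 10].append(value % 10)
    return dict(plot)

def frequency_table(data, start, width, n_classes):
    """Count values in classes [start, start + width), [start + width, ...)."""
    counts = Counter((value - start) // width for value in data)
    return [
        (start + k * width, start + (k + 1) * width, counts.get(k, 0))
        for k in range(n_classes)
    ]

stems = stem_and_leaf(weights)
table = frequency_table(weights, start=130, width=10, n_classes=8)

for stem, leaves in stems.items():
    print(stem, "|", "".join(str(leaf) for leaf in leaves))
```

The stem-and-leaf display keeps every original value, while the frequency table keeps only the counts per class, which is the trade-off the notes describe.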

Chapter 12 Describing Distributions with Numbers

Measures of Central Tendency: descriptions of the average (typical value)

Sample mean (simple average): the sum of the data values divided by the sample size,
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
where n is the sample size and $x_1, \ldots, x_n$ are the observations.
Select 4 students and ask "How many brothers and sisters do you have?"
Answers: 2, 3, 1, 3. The mean is (2 + 3 + 1 + 3)/4 = 2.25.
Notice that if the fourth person had responded that they had 10 brothers instead of 3, the mean would be 4 instead. This shows that the mean is influenced by extreme values.

Here is something that is not influenced by extreme values.
Sample median (middle score):
Rank the data from smallest to largest.
If n is odd, the median is the middle score.
If n is even, the median is the average of the two middle scores.
Observations (number of siblings): 2, 3, 1, 3. Ordered: 1, 2, 3, 3. Median = 2.5.
Observations (with the fourth responder saying 10 instead of 3): 2, 3, 1, 10. Ordered: 1, 2, 3, 10. Median = 2.5.

Sample mode: the most frequent score.
Observations: 2, 3, 1, 3. Mode = 3.
What if there is no mode?! If no number occurs more than once, we say there is no mode, but if two numbers tie for the largest number of occurrences, then each of them gets the title of mode.
Does not always exist, and there can be more than one.
Unstable (if we start rounding, the mode can change drastically).
Can be used with qualitative data.

Measures of Dispersion (Variability)

[Figure: Distribution #1 and Distribution #2]
The mean, median and mode are all 35 in both distributions above, but there is a big difference between the two distributions! How we measure the differences:

Sample range: (highest observation) − (lowest observation)
Years of experience of faculty: 1, 30, 22, 10, 5
Sample range = 30 − 1 = 29 years
This is easy to compute but extremely sensitive to extreme scores.

Sample variance: measures the average squared distance from the mean.
Sample standard deviation: the square root of the sample variance; roughly, the typical distance of the observations from the mean.
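The sibling and faculty examples above can be checked with Python's standard `statistics` module. This is only a sketch: the helper `modes` is a made-up name that follows the notes' convention of reporting no mode when no value repeats.

```python
import statistics
from collections import Counter

siblings = [2, 3, 1, 3]
siblings_extreme = [2, 3, 1, 10]  # fourth responder says 10 instead of 3

def modes(data):
    """All most-frequent values, or [] if no value occurs more than once."""
    counts = Counter(data)
    top = max(counts.values())
    return [] if top == 1 else sorted(v for v, c in counts.items() if c == top)

mean_a = statistics.mean(siblings)              # 2.25
mean_b = statistics.mean(siblings_extreme)      # 4: pulled up by the extreme value
median_a = statistics.median(siblings)          # 2.5
median_b = statistics.median(siblings_extreme)  # 2.5: unchanged

experience = [1, 30, 22, 10, 5]
sample_range = max(experience) - min(experience)  # 29 years
sample_variance = statistics.variance(experience)
sample_sd = statistics.stdev(experience)
```

The mean moves from 2.25 to 4 when one answer changes, while the median stays at 2.5, which is the robustness the notes point out.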

Standard deviation is incredibly important to this class; we will discuss the formulas and how to compute it in class.

Measures of Position

Quartiles: divide the data into four equally sized parts.
First quartile, $Q_1$: 25% of the data lie below, 75% of the data lie above.
Second quartile (median), $Q_2$: 50% of the data lie below, 50% of the data lie above.
Third quartile, $Q_3$: 75% of the data lie below, 25% of the data lie above.

Procedure to compute quartiles
1) Order the data from smallest to largest
2) Find the median. This is the 2nd quartile
3) $Q_1$ is the median of the lower half of the data
4) $Q_3$ is the median of the upper half of the data
If there is an odd number of observations, take out the median before computing the first and third quartiles. If there is an even number of observations, leave all the observations in when computing the first and third quartiles.

5 number summary: Min, $Q_1$, median, $Q_3$, Max
Interquartile range (IQR) = $Q_3 - Q_1$ = range of the middle 50% of the data

           Students   Faculty
Min           0         10
$Q_1$         1         15
Median        5         25
$Q_3$         7         31
Max          10         73
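The quartile procedure above translates fairly directly into code. In this sketch the ten data values are hypothetical, chosen only so that the result reproduces the Students column of the summary table; the notes do not give the underlying data.

```python
import statistics

def five_number_summary(data):
    """Min, Q1, median, Q3, Max using the median-of-halves procedure."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # Odd n: drop the median itself; even n: keep every observation.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return (
        xs[0],
        statistics.median(lower),
        statistics.median(xs),
        statistics.median(upper),
        xs[-1],
    )

students = [0, 1, 1, 2, 5, 5, 6, 7, 7, 10]  # hypothetical raw data
summary = five_number_summary(students)
iqr = summary[3] - summary[1]
```

Other software may use slightly different quartile conventions, so results can differ a little from this hand procedure on small data sets.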

Boxplots: Procedure
1) Draw a scale to include the lowest and highest data value (USE EVEN INCREMENTS!)
2) Draw a box from $Q_1$ to $Q_3$
3) Draw a solid line through the box at the median
4) Draw solid lines, called whiskers, from $Q_1$ to the lowest value and from $Q_3$ to the highest value

Chapter 13 Normal Distributions

As the class widths for a histogram become smaller and smaller, the top of the histogram becomes more curve-like.
We set up these curves so that the area under the curve represents the proportion of observations.
This is known as the density curve and is the most common way of representing a population.
Another way to determine shape is by comparing the mean and median.
The median of a density curve is the point that divides the area in half.

The mean of a density curve is the balance point of the density function.
Because of this:
If the distribution is symmetric, then the mean and median are equal.
If the mean is greater than the median, then the curve is skewed right.
If the mean is less than the median, then the curve is skewed left.

If the curve follows a normal distribution (Gaussian distribution), then it will be a bell-shaped curve.
Density curves are useful in determining what proportion or percentage of the population falls within an interval. The area under the curve represents this proportion, and the total area is 1.
The normal distribution is characterized by $\mu$ (population mean) and $\sigma$ (population standard deviation).
A normal curve with $\mu = 0$ and $\sigma = 1$ is called the standard normal curve.

A percentile represents the position of your measurement in comparison with everyone else's and gives the percentage of the population that falls below you.
To find a percentile we will use standardized scores (z-scores), denoted z:
$z = \frac{x - \mu}{\sigma}$
The z-score for an observation is just the number of standard deviations the observation is above the mean.
Example: If your height is 70 inches, and the heights of the class are normally distributed with $\mu = 65$ and $\sigma = 5$, then you have $z = 1$. That is, your height is 1 standard deviation above the mean.
z-scores allow us to transform any normal curve into a standard normal curve.

Empirical Rule
Approximately 68% of the data fall within 1 standard deviation of the mean: $(\bar{x} - s, \bar{x} + s)$
Approximately 95% of the data fall within 2 standard deviations of the mean: $(\bar{x} - 2s, \bar{x} + 2s)$
Approximately 99.7% of the data fall within 3 standard deviations of the mean: $(\bar{x} - 3s, \bar{x} + 3s)$
For a normal distribution, these percentages hold to the stated rounding (68.27%, 95.45%, 99.73%).
If an observation is not 1, 2, or 3 standard deviations from the mean, we cannot use the rule. To determine percentages, we use a z-score table, like the one on the next page.
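In place of a printed z-score table, the standard normal curve can be evaluated directly. The sketch below uses `statistics.NormalDist` (available in Python 3.8 and later) on the height example; the variable names are made up for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations x lies above the mean."""
    return (x - mu) / sigma

standard_normal = NormalDist(mu=0, sigma=1)

z = z_score(70, mu=65, sigma=5)        # 1.0
percentile = standard_normal.cdf(z)    # about 0.8413, the 84th percentile

# Empirical rule: area within 1, 2 and 3 standard deviations of the mean.
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}
```

So a 70-inch student in this class is taller than roughly 84% of classmates, and the three areas come out near 0.683, 0.954 and 0.997.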


Chapter 14 Describing Relationships: Scatterplots and Correlation

Scatterplot or scatter diagram: displays the relationship between two quantitative variables.
x-axis: independent variable (explanatory variable)
y-axis: dependent variable (response variable)
Example: Age (x), Height (y)

Correlation: a measure of association that tests whether a relationship exists between two variables.
In general we will be looking for linear correlations, i.e. how closely the data follow a line when plotted.

The correlation coefficient (denoted r) is a value which measures correlation and indicates both the strength of the association and its direction.
Positive r suggests that as the x values increase, so do the y values. It also happens that as the x values decrease, so do the y values.
Negative r suggests the opposite: when the x values increase, the y values decrease, and when the x values decrease, the y values increase.
We will always have that $-1 \le r \le 1$:
r close to 1: the data are close to a straight line with positive slope
r close to −1: the data are close to a straight line with negative slope
$r \approx 0$: no linear relationship
The stronger the linear relationship, the closer r is to −1 or 1. Generally, we will say there is a strong relationship if $|r| \ge 0.75$.
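The notes do not spell out how r is computed, so the following is a sketch of the usual Pearson formula rather than the course's own procedure. The age and height values are hypothetical numbers invented for the example.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

ages = [2, 4, 6, 8, 10]          # hypothetical ages in years
heights = [34, 40, 45, 50, 54]   # hypothetical heights in inches
r = correlation(ages, heights)
is_strong = abs(r) >= 0.75
```

Here r comes out very close to 1, which matches the picture of points lying near a line with positive slope.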

Note that it is never possible to prove causality just based on the relationship between two variables.
There is a strong statistical correlation over months of the year between ice cream consumption and the number of assaults in the U.S. Does this mean ice cream manufacturers are responsible for crime? No! The correlation occurs statistically because the hot temperatures of summer increase both ice cream consumption and assaults.
Thus, correlation does NOT imply causation. Other factors besides cause and effect can create an observed correlation.

To establish whether two variables are causally related you must establish:
1) Time order: the cause must have occurred before the effect
2) Co-variation (statistical association): the correlation coefficient must show a strong relationship between the dependent and independent variable
3) Rationale: there must be a logical and compelling explanation for why these two variables are related
4) Non-spuriousness: it must be established that the independent variable X, and only X, was the cause of changes in the dependent variable Y; rival explanations must be ruled out

This type of research is very complex, and the researcher can never be completely certain that there are not other factors influencing the causal relationship.
To help identify a relationship as cause and effect, a study is often performed many times. The study should yield the same results every time it is conducted (if this occurs it helps rule out rival explanations).

Chapter 15 Describing Relationships: Regression, Prediction, and Causation

Linear Regression

Purpose of linear regression: to predict the value of a difficult-to-measure variable, y (response variable), based on an easy-to-measure variable, x (explanatory variable).
Example: predict the finishing time of the men's 100 meter dash in 2032.
We do this by using a line that fits the data, called a regression line.
Lines have equations of the form $y = mx + b$, where b is the y-intercept and m is the slope.
In order to use linear regression, make sure the model is reasonable (the scatter plot and r should indicate a strong correlation).
We need a line that is the best fit for our data. To accomplish this we will use the method of least squares.

To find the least squares regression line, we are essentially looking to minimize the total area of the squares created from a possible regression line and our observations.
While we do not compute it this way, the least squares regression line can be found by seeing that $m = r \frac{s_y}{s_x}$ and $b = \bar{y} - m\bar{x}$, where $s_x$ and $s_y$ are the sample standard deviations of x and y.
We will do an example in class (be sure to include it in your notes).

Interpolation: predicting y values for x values that are within the range of the scatter plot (this is what regression should be used for).
Extrapolation: predicting y values for x values beyond the range of the observations (in general, this should not be done, as it can pose a problem).
It is possible to create a scatter plot where the explanatory variable is age and the response variable is height. When comparing a child's height to her age, it may seem as if the data have a strong linear correlation. By using a regression line to extrapolate, we might find that at the age of 28 we would expect a height of 7 feet. This problem arises because our growth over time is not typically linear.
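A least squares line can be computed from the correlation and the two standard deviations, using the standard identities $m = r\,s_y/s_x$ and $b = \bar{y} - m\bar{x}$. This is a sketch under that assumption; the function name and the small data set (points lying exactly on y = 2x + 1) are invented for illustration.

```python
import statistics

def least_squares_line(xs, ys):
    """Return (slope m, intercept b) of the least squares regression line."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / ((len(xs) - 1) * s_x * s_y)  # correlation coefficient
    m = r * s_y / s_x
    b = mean_y - m * mean_x
    return m, b

xs = [1, 2, 3, 4]   # hypothetical explanatory values
ys = [3, 5, 7, 9]   # hypothetical responses, exactly y = 2x + 1
m, b = least_squares_line(xs, ys)

def predict(x):
    """Only trustworthy for x inside the observed range (interpolation)."""
    return m * x + b
```

Calling `predict(2.5)` interpolates inside the data; calling it far outside the range 1 to 4 is exactly the extrapolation the notes warn about.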

Chapter 17 & 18 Thinking about Chance & Probability

When thinking about chance, we consider outcomes, the possible things that can happen.
When rolling 2 dice, an example of something that can happen is rolling snake eyes (both land on one). Rolling 2 ones is an example of an outcome when rolling 2 dice.
We call the collection of outcomes we care about an event.
When playing cards we might like to consider the chance of getting a royal flush. Our event would be getting a royal flush, which has 4 outcomes (one for each suit).

The probability of an event A, denoted P(A), is the expected proportion of occurrences of A if the experiment were performed a large number of times.
In general we can compute probability if the chance of each outcome is equal. In this case:
$P(A) = \frac{\text{number of outcomes in } A}{\text{number of outcomes in the sample space}}$

The set of all possible outcomes is called the sample space.
Examples:
Roll a d20 (20-sided die). Sample space: {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}
Flip a coin. Sample space: {H,T}

Usually it will be hard to list the entire sample space. For instance, listing out all of the possible card hands would take a very long time. We therefore resort to counting principles.

Counting principles
How many ways are there to arrange the letters in the word SUNDAY?
Here we have 6 letters, which cannot have repeats. We have 6 choices for the first letter; this leaves us with 5 choices for the second letter, 4 for the third, and finally one choice for the sixth letter, giving us 6 x 5 x 4 x 3 x 2 x 1 = 720 ways to arrange the letters.
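The SUNDAY count above can be verified by brute force, and the equally-likely-outcomes rule can be applied to the d20 sample space. This is a small sketch; the event "roll at least 15" is an assumption added for illustration.

```python
import math
from fractions import Fraction
from itertools import permutations

# Arrangements of SUNDAY: 6 x 5 x 4 x 3 x 2 x 1.
by_formula = math.factorial(6)
by_listing = len(set(permutations("SUNDAY")))

# Equally likely outcomes: P(A) = outcomes in A / outcomes in sample space.
sample_space = list(range(1, 21))                   # a d20
event = [roll for roll in sample_space if roll >= 15]  # hypothetical event
p_event = Fraction(len(event), len(sample_space))
```

Listing all 720 arrangements is still feasible here, but the factorial grows so quickly that counting principles soon become the only practical route.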

If a burglar alarm system has a 3-digit code and each digit is a number between 0 and 9, then how many possible codes are there?
Here we have 10 different numbers and we are okay with repeats, so there are 10 choices for each of the 3 digits, giving us 10 x 10 x 10 = 1000 different codes.

If we are interested in looking at a series of operations, then a device called a tree diagram is useful for determining the sample space.
Flip a penny, a nickel, and a dime: in this tree we see that there are 8 possible outcomes. This gives us the ability to compute probabilities:
If A is the event of getting all three tails, what is P(A)?
If B is the event of getting exactly two tails, what is P(B)?
If C is the event of getting exactly one tail, what is P(C)?
If D is the event of getting no tails, what is P(D)?

Thinking about the above counting techniques, we see that there should be some formulas at our disposal.
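The code count and the coin-flip tree can both be enumerated with `itertools.product`, which plays the role of the tree diagram here. This is a sketch; the helper name `prob_tails` is made up for the example.

```python
from fractions import Fraction
from itertools import product

# All 3-digit codes with repeats allowed: 10 x 10 x 10.
codes = list(product(range(10), repeat=3))

# Every leaf of the penny / nickel / dime tree.
flips = list(product("HT", repeat=3))

def prob_tails(k):
    """Probability of exactly k tails among the three coins."""
    favorable = [outcome for outcome in flips if outcome.count("T") == k]
    return Fraction(len(favorable), len(flips))

p_a = prob_tails(3)  # all three tails
p_b = prob_tails(2)  # exactly two tails
p_c = prob_tails(1)  # exactly one tail
p_d = prob_tails(0)  # no tails
```

The four events cover every outcome exactly once, so their probabilities (1/8, 3/8, 3/8, 1/8) add to 1.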

The complement of an event A, denoted $A^c$, is all outcomes not in A.
The complement rule: $P(A^c) = 1 - P(A)$
The addition rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

If we are interested in knowing whether event A occurred given that we know event B occurred, this is known as conditional probability, denoted $A \mid B$ or "A given B".
The conditional probability rule: $P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
The multiplication rule: $P(A \text{ and } B) = P(A) \, P(B \mid A)$

We say the events A and B are mutually exclusive or disjoint if they cannot occur together: $P(A \text{ and } B) = 0$

Two events are said to be independent if the occurrence (or nonoccurrence) of one does not affect the probability of occurrence of the other: $P(A) = P(A \mid B)$
Events that are not independent are dependent.

For dependent events: $P(A) \ne P(A \mid B)$

Example: Draw two cards without replacement.
A = {first card is an ace}
B = {second card is an ace}
A and B are dependent:
$P(A \text{ and } B) = P(A) \, P(B \mid A) = \left(\frac{4}{52}\right)\left(\frac{3}{51}\right)$
Suppose we return the first card and thoroughly shuffle before we draw the second. Then A and B are independent:
$P(A \text{ and } B) = P(A) \, P(B \mid A) = \left(\frac{4}{52}\right)\left(\frac{4}{52}\right)$

We can also use a density curve when our outcomes are not discrete, but continuous.
Example: determining the probability that a sample statistic from a sample of size 100 is within 10% of the parameter.
AT THIS TIME YOU MIGHT BE THINKING MARGIN OF ERROR, MARGIN OF ERROR, MARGIN OF ERROR
It turns out that in this case there is a 95% chance, since the margin of error with a sample size of 100 is 10%.
This type of probability model is continuous. We compute probabilities using areas under the density curve. We also require that the total area under the density curve is 1.
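The two-ace example above is a direct use of the multiplication rule. The sketch below works it out with exact fractions; the variable names are chosen for this illustration.

```python
from fractions import Fraction

aces, deck = 4, 52

# Without replacement: the second draw depends on the first.
p_first_ace = Fraction(aces, deck)
p_second_given_first = Fraction(aces - 1, deck - 1)
p_both_dependent = p_first_ace * p_second_given_first   # (4/52)(3/51)

# With replacement and a thorough shuffle: the draws are independent.
p_both_independent = p_first_ace * Fraction(aces, deck)  # (4/52)(4/52)
```

The products reduce to 1/221 and 1/169, so drawing without replacement makes a pair of aces slightly less likely.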

Chapter 20 The House Edge: Expected Values

Mean and Standard Deviation of a Discrete Probability Distribution

One of the most asked questions in probability: If the probability that I win $10 is ¼, $20 is ¼ and $0 is ½, what will I win on average?

The mean (denoted $\mu$) of a probability model is the outcome one would expect on average.
The equation for this mean is not much different from the old equation for the mean. If you win $10 ¼ of the time, $20 ¼ of the time and $0 ½ of the time, you would expect out of every 4 plays to get $10, $20, $0 and $0, so your mean would be
$\frac{10 + 20 + 0 + 0}{4} = 7.50$
but this is equal to
$10\left(\tfrac{1}{4}\right) + 20\left(\tfrac{1}{4}\right) + 0\left(\tfrac{1}{2}\right) = 7.50$

The standard deviation (denoted $\sigma$) of a probability model measures, roughly, the typical distance of each outcome from the mean, where the weighting is given by the probabilities. More precisely, it is the square root of the probability-weighted average of the squared distances from the mean.
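The winnings example above can be computed as a probability-weighted sum. This is a minimal sketch; the dictionary layout and variable names are assumptions for illustration, and the standard deviation uses the usual square-root-of-weighted-squared-distances definition.

```python
import math

# Outcome (dollars won) -> probability, from the example in the notes.
model = {10: 0.25, 20: 0.25, 0: 0.5}

# Mean: each outcome weighted by its probability.
mean = sum(x * p for x, p in model.items())  # 7.5

# Variance: probability-weighted average of squared distances from the mean.
variance = sum(p * (x - mean) ** 2 for x, p in model.items())
sd = math.sqrt(variance)
```

An average win of $7.50 with a standard deviation of about $8.29 says that individual plays land quite far from the average.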

Chapter 21 What is a Confidence Interval?

In Chapter 3 we talked about 95% confidence statements.
Reminder: A statistic from a sample of size n has a margin of error of approximately $1/\sqrt{n}$.
This is because the statistics from samples of size n are normally distributed with a mean equal to the parameter and a standard deviation of approximately $\frac{1}{2\sqrt{n}}$.

While what we did in Chapter 3 was great, something worth noting is that the standard deviation of the distribution of statistics should also be based on the parameter. You wouldn't expect to allow a confidence interval of something like: "I'm 95% confident that between 90 and 110 percent of people like brownies." This comes about because we could have a parameter of 100% while our sample size is small. I mean, who doesn't like brownies?

Let the new standard deviation be defined to be
$\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
where $\hat{p}$ is the statistic from a sample of n individuals.
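The difference between the quick rule and the statistic-based standard deviation can be seen numerically. In the sketch below the formula $\sqrt{\hat{p}(1-\hat{p})/n}$ and the multiplier of 2 standard deviations for 95% are assumptions taken from the usual large-sample method; the brownie sample of 100 people is hypothetical.

```python
import math

def proportion_ci(p_hat, n, z=2):
    """Approximate 95% interval: p_hat plus or minus z standard deviations."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

def quick_ci(p_hat, n):
    """Chapter 3 rule: p_hat plus or minus 1 / sqrt(n)."""
    moe = 1 / math.sqrt(n)
    return p_hat - moe, p_hat + moe

# Hypothetical survey: every one of 100 people likes brownies.
quick_low, quick_high = quick_ci(1.0, 100)    # 0.90 to 1.10: impossible upper end
new_low, new_high = proportion_ci(1.0, 100)   # collapses to 1.0

# At p_hat = 0.5 the two methods agree.
half_low, half_high = proportion_ci(0.5, 100)  # 0.40 to 0.60
```

The quick rule is the worst case of the new formula, reached at a statistic of 50%, which is why the two agree there and differ near 0% or 100%.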

Confidence intervals are about to blow your mind!

Sampling Distribution of the Sample Mean

Suppose you don't just want to know what percentage of the population has brown eyes or other characteristics like that. Suppose you want to know the average IQ of the American population; you want to know the mean of some quantitative variable.
We only have the tools, thus far, to approximate the percentage of the population that has some characteristic. If we wanted to answer a quantitative question, we could only say things like "54% of the population has more than 2 cats," when we would like to say things like "the average person has 1.7 cats." You kind of have to feel bad for the 7/10 of a cat running around.

To approximate the mean we notice the following:
If we have a sample of size n, we can compute the mean value of the observations, $\bar{x}$.
If we consider the collection of mean values from all samples of size n (just like we did when we looked at statistics), we see that the values that $\bar{x}$ takes are normally distributed with a mean, which we will call $\mu$, and a standard deviation of $\sigma/\sqrt{n}$, where $\sigma$ is the actual standard deviation of all observations from the population.

36 What can we do with this?

Suppose we have a sample of size n. We can compute the mean, which we will call $\bar{x}$. We can compute the standard deviation, which we will call s.

From this we can use the fact that the sample means are normally distributed, approximate the mean for the population to be $\bar{x}$, and approximate the standard deviation of this distribution to be $s/\sqrt{n}$. This lets us make a 95% confidence interval: "I am 95% confident that the population mean is between $\bar{x} - 2s/\sqrt{n}$ and $\bar{x} + 2s/\sqrt{n}$."
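The recipe above can be sketched in a few lines of Python. The function name and the cat-ownership data are hypothetical, and the multiplier 2 is the rough 95% rule used in these notes rather than an exact value. With only 10 observations the normal approximation is rough, so treat the result as an illustration of the arithmetic.

```python
import math
import statistics

def mean_confidence_interval(observations):
    """Approximate 95% interval: x_bar plus or minus 2 * s / sqrt(n)."""
    n = len(observations)
    x_bar = statistics.mean(observations)   # sample mean
    s = statistics.stdev(observations)      # sample standard deviation
    margin = 2 * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: number of cats owned by 10 people.
cats = [0, 1, 1, 2, 2, 2, 3, 0, 4, 2]

low, high = mean_confidence_interval(cats)
print(f"I am 95% confident that the population mean is between {low:.2f} and {high:.2f}")
```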

37 Chapter 22 What is a Test of Significance?

Statistical hypotheses: statements about a population parameter.

Suppose you think people can't tell the difference between sugar and artificial sweeteners. Your hypothesis: 50% of people would say they like sugar better in a blind taste test. Notice: Hypotheses are not necessarily correct!

In statistics, we test one hypothesis against another:

The hypothesis that we want to prove is called the alternative hypothesis, $H_a$. We might want to show that sugar is actually preferred, or that our parameter is greater than 50%.

The hypothesis that is contradictory to $H_a$ is called the null hypothesis, $H_0$.

To determine if $H_a$ is believable we conduct a study with a sample and either reject $H_0$ and believe $H_a$, or fail to reject $H_0$ because there was not sufficient evidence to reject it.

38 Example: Suppose it is believed by others that there is no difference between sugar and artificial sweeteners, but you believe that sugar is better liked.

The null hypothesis is that half of the population would like sugar better. The alternative hypothesis is that more than half of the population would like sugar better.

Now suppose we ask a sample of 100 people and see that 63% of them like sugar better. This certainly suggests that the null hypothesis is wrong, but could it have been a coincidence?

Notice that if $H_0$ is true, then we would expect the sample statistics from samples of size 100 to be normally distributed with a mean of .5, and we would expect a standard deviation of $\sqrt{(.5)(.5)/100} = .05$. So 68% of samples should have statistics between 45% and 55%, and 95% of samples should have statistics between 40% and 60%. Because of this we can see how unlikely it is that we got one of the few samples with a statistic as large as 63%.

In fact, we can determine the probability that such a thing would occur: the z-score of 63% is $(.63 - .5)/.05 = 2.6$, and the associated percentile is 99.53%, so the likelihood of a statistic as high as 63% is 100% - 99.53% = .47%.

The likelihood of getting a statistic at least this extreme if the null hypothesis is true is referred to as the P-value. Because it is highly unlikely that we would get a statistic of 63% if the null hypothesis were true, we reject the null hypothesis. Since the statistic satisfied the alternative hypothesis, we also use this as evidence that the alternative hypothesis is in fact true.

If we want to be more sure that the null hypothesis is untrue, we can lower the P-value we require. The level that the P-value must be under is referred to as the level of significance. For instance, if we were testing at a level of significance of .1%, we would not reject $H_0$ in the above example, but we would if the level of significance were 1%.
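The arithmetic in this example can be checked with a short Python sketch. The helper `normal_cdf` is a name introduced here for illustration; it uses the standard library's `math.erf` to get the proportion of a normal distribution below a given z-score, which stands in for looking up the percentile in a table.

```python
import math

def normal_cdf(z):
    """Proportion of a standard normal distribution that lies below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p_null = 0.5   # parameter claimed by the null hypothesis
p_hat = 0.63   # statistic from the sample
n = 100        # sample size

sd = math.sqrt(p_null * (1 - p_null) / n)  # standard deviation if H0 is true
z = (p_hat - p_null) / sd                  # z-score of the statistic
p_value = 1 - normal_cdf(z)                # chance of a statistic at least this large

print(f"sd = {sd:.3f}, z = {z:.2f}, P-value = {p_value:.4f}")

# Compare the P-value with the two levels of significance from the example.
for level in (0.01, 0.001):
    decision = "reject H0" if p_value < level else "fail to reject H0"
    print(f"level of significance {level:.1%}: {decision}")
```

The P-value comes out just under half a percent, so the decision depends on which level of significance was chosen before looking at the data.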

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Key Vocabulary:! individual! variable! frequency table! relative frequency table! distribution! pie chart! bar graph! two-way table! marginal distributions! conditional distributions!

More information

Business Statistics Probability

Business Statistics Probability Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you?

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you? WDHS Curriculum Map Probability and Statistics Time Interval/ Unit 1: Introduction to Statistics 1.1-1.3 2 weeks S-IC-1: Understand statistics as a process for making inferences about population parameters

More information

Chapter 7: Descriptive Statistics

Chapter 7: Descriptive Statistics Chapter Overview Chapter 7 provides an introduction to basic strategies for describing groups statistically. Statistical concepts around normal distributions are discussed. The statistical procedures of

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 13 & Appendix D & E (online) Plous Chapters 17 & 18 - Chapter 17: Social Influences - Chapter 18: Group Judgments and Decisions Still important ideas Contrast the measurement

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Still important ideas

Still important ideas Readings: OpenStax - Chapters 1 11 + 13 & Appendix D & E (online) Plous - Chapters 2, 3, and 4 Chapter 2: Cognitive Dissonance, Chapter 3: Memory and Hindsight Bias, Chapter 4: Context Dependence Still

More information

Population. Sample. AP Statistics Notes for Chapter 1 Section 1.0 Making Sense of Data. Statistics: Data Analysis:

Population. Sample. AP Statistics Notes for Chapter 1 Section 1.0 Making Sense of Data. Statistics: Data Analysis: Section 1.0 Making Sense of Data Statistics: Data Analysis: Individuals objects described by a set of data Variable any characteristic of an individual Categorical Variable places an individual into one

More information

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions Readings: OpenStax Textbook - Chapters 1 5 (online) Appendix D & E (online) Plous - Chapters 1, 5, 6, 13 (online) Introductory comments Describe how familiarity with statistical methods can - be associated

More information

M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 60

M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 60 M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points 1-10 10 11 3 12 4 13 3 14 10 15 14 16 10 17 7 18 4 19 4 Total 60 Multiple choice questions (1 point each) For questions

More information

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions Readings: OpenStax Textbook - Chapters 1 5 (online) Appendix D & E (online) Plous - Chapters 1, 5, 6, 13 (online) Introductory comments Describe how familiarity with statistical methods can - be associated

More information

V. Gathering and Exploring Data

V. Gathering and Exploring Data V. Gathering and Exploring Data With the language of probability in our vocabulary, we re now ready to talk about sampling and analyzing data. Data Analysis We can divide statistical methods into roughly

More information

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences.

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences. SPRING GROVE AREA SCHOOL DISTRICT PLANNED COURSE OVERVIEW Course Title: Basic Introductory Statistics Grade Level(s): 11-12 Units of Credit: 1 Classification: Elective Length of Course: 30 cycles Periods

More information

Understandable Statistics

Understandable Statistics Understandable Statistics correlated to the Advanced Placement Program Course Description for Statistics Prepared for Alabama CC2 6/2003 2003 Understandable Statistics 2003 correlated to the Advanced Placement

More information

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14 Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14 Still important ideas Contrast the measurement of observable actions (and/or characteristics)

More information

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Plous Chapters 17 & 18 Chapter 17: Social Influences Chapter 18: Group Judgments and Decisions

More information

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 5, 6, 7, 8, 9 10 & 11)

More information

Undertaking statistical analysis of

Undertaking statistical analysis of Descriptive statistics: Simply telling a story Laura Delaney introduces the principles of descriptive statistical analysis and presents an overview of the various ways in which data can be presented by

More information

Statistical Methods Exam I Review

Statistical Methods Exam I Review Statistical Methods Exam I Review Professor: Dr. Kathleen Suchora SI Leader: Camila M. DISCLAIMER: I have created this review sheet to supplement your studies for your first exam. I am a student here at

More information

I. Introduction and Data Collection B. Sampling. 1. Bias. In this section Bias Random Sampling Sampling Error

I. Introduction and Data Collection B. Sampling. 1. Bias. In this section Bias Random Sampling Sampling Error I. Introduction and Data Collection B. Sampling In this section Bias Random Sampling Sampling Error 1. Bias Bias a prejudice in one direction (this occurs when the sample is selected in such a way that

More information

Section 3.2 Least-Squares Regression

Section 3.2 Least-Squares Regression Section 3.2 Least-Squares Regression Linear relationships between two quantitative variables are pretty common and easy to understand. Correlation measures the direction and strength of these relationships.

More information

Quizzes (and relevant lab exercises): 20% Midterm exams (2): 25% each Final exam: 30%

Quizzes (and relevant lab exercises): 20% Midterm exams (2): 25% each Final exam: 30% 1 Intro to statistics Continued 2 Grading policy Quizzes (and relevant lab exercises): 20% Midterm exams (2): 25% each Final exam: 30% Cutoffs based on final avgs (A, B, C): 91-100, 82-90, 73-81 3 Numerical

More information

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0%

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0% Capstone Test (will consist of FOUR quizzes and the FINAL test grade will be an average of the four quizzes). Capstone #1: Review of Chapters 1-3 Capstone #2: Review of Chapter 4 Capstone #3: Review of

More information

Table of Contents. Plots. Essential Statistics for Nursing Research 1/12/2017

Table of Contents. Plots. Essential Statistics for Nursing Research 1/12/2017 Essential Statistics for Nursing Research Kristen Carlin, MPH Seattle Nursing Research Workshop January 30, 2017 Table of Contents Plots Descriptive statistics Sample size/power Correlations Hypothesis

More information

Part 1. For each of the following questions fill-in the blanks. Each question is worth 2 points.

Part 1. For each of the following questions fill-in the blanks. Each question is worth 2 points. Part 1. For each of the following questions fill-in the blanks. Each question is worth 2 points. 1. The bell-shaped frequency curve is so common that if a population has this shape, the measurements are

More information

Further Mathematics 2018 CORE: Data analysis Chapter 3 Investigating associations between two variables

Further Mathematics 2018 CORE: Data analysis Chapter 3 Investigating associations between two variables Chapter 3: Investigating associations between two variables Further Mathematics 2018 CORE: Data analysis Chapter 3 Investigating associations between two variables Extract from Study Design Key knowledge

More information

UF#Stats#Club#STA#2023#Exam#1#Review#Packet# #Fall#2013#

UF#Stats#Club#STA#2023#Exam#1#Review#Packet# #Fall#2013# UF#Stats#Club#STA##Exam##Review#Packet# #Fall## The following data consists of the scores the Gators basketball team scored during the 8 games played in the - season. 84 74 66 58 79 8 7 64 8 6 78 79 77

More information

AP Stats Review for Midterm

AP Stats Review for Midterm AP Stats Review for Midterm NAME: Format: 10% of final grade. There will be 20 multiple-choice questions and 3 free response questions. The multiple-choice questions will be worth 2 points each and the

More information

Medical Statistics 1. Basic Concepts Farhad Pishgar. Defining the data. Alive after 6 months?

Medical Statistics 1. Basic Concepts Farhad Pishgar. Defining the data. Alive after 6 months? Medical Statistics 1 Basic Concepts Farhad Pishgar Defining the data Population and samples Except when a full census is taken, we collect data on a sample from a much larger group called the population.

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

Psychology Research Process

Psychology Research Process Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:

More information

Statistics are commonly used in most fields of study and are regularly seen in newspapers, on television, and in professional work.

Statistics are commonly used in most fields of study and are regularly seen in newspapers, on television, and in professional work. I. Introduction and Data Collection A. Introduction to Statistics In this section Basic Statistical Terminology Branches of Statistics Types of Studies Types of Data Levels of Measurement 1. Basic Statistical

More information

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review

Results & Statistics: Description and Correlation. I. Scales of Measurement A Review Results & Statistics: Description and Correlation The description and presentation of results involves a number of topics. These include scales of measurement, descriptive statistics used to summarize

More information

Example The median earnings of the 28 male students is the average of the 14th and 15th, or 3+3

Example The median earnings of the 28 male students is the average of the 14th and 15th, or 3+3 Lecture 3 Nancy Pfenning Stats 1000 We learned last time how to construct a stemplot to display a single quantitative variable. A back-to-back stemplot is a useful display tool when we are interested in

More information

Review+Practice. May 30, 2012

Review+Practice. May 30, 2012 Review+Practice May 30, 2012 Final: Tuesday June 5 8:30-10:20 Venue: Sections AA and AB (EEB 125), sections AC and AD (EEB 105), sections AE and AF (SIG 134) Format: Short answer. Bring: calculator, BRAINS

More information

M 140 Test 1 A Name (1 point) SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75

M 140 Test 1 A Name (1 point) SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75 M 140 est 1 A Name (1 point) SHOW YOUR WORK FOR FULL CREDI! Problem Max. Points Your Points 1-10 10 11 10 12 3 13 4 14 18 15 8 16 7 17 14 otal 75 Multiple choice questions (1 point each) For questions

More information

STATISTICS INFORMED DECISIONS USING DATA

STATISTICS INFORMED DECISIONS USING DATA STATISTICS INFORMED DECISIONS USING DATA Fifth Edition Chapter 4 Describing the Relation between Two Variables 4.1 Scatter Diagrams and Correlation Learning Objectives 1. Draw and interpret scatter diagrams

More information

Outline. Practice. Confounding Variables. Discuss. Observational Studies vs Experiments. Observational Studies vs Experiments

Outline. Practice. Confounding Variables. Discuss. Observational Studies vs Experiments. Observational Studies vs Experiments 1 2 Outline Finish sampling slides from Tuesday. Study design what do you do with the subjects/units once you select them? (OI Sections 1.4-1.5) Observational studies vs. experiments Descriptive statistics

More information

STATISTICS & PROBABILITY

STATISTICS & PROBABILITY STATISTICS & PROBABILITY LAWRENCE HIGH SCHOOL STATISTICS & PROBABILITY CURRICULUM MAP 2015-2016 Quarter 1 Unit 1 Collecting Data and Drawing Conclusions Unit 2 Summarizing Data Quarter 2 Unit 3 Randomness

More information

15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA

15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA 15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA Statistics does all kinds of stuff to describe data Talk about baseball, other useful stuff We can calculate the probability.

More information

Welcome to OSA Training Statistics Part II

Welcome to OSA Training Statistics Part II Welcome to OSA Training Statistics Part II Course Summary Using data about a population to draw graphs Frequency distribution and variability within populations Bell Curves: What are they and where do

More information

STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS

STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS Circle the best answer. This scenario applies to Questions 1 and 2: A study was done to compare the lung capacity of coal miners to the lung

More information

Biostatistics. Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego

Biostatistics. Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego Biostatistics Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego (858) 534-1818 dsilverstein@ucsd.edu Introduction Overview of statistical

More information

AP Statistics. Semester One Review Part 1 Chapters 1-5

AP Statistics. Semester One Review Part 1 Chapters 1-5 AP Statistics Semester One Review Part 1 Chapters 1-5 AP Statistics Topics Describing Data Producing Data Probability Statistical Inference Describing Data Ch 1: Describing Data: Graphically and Numerically

More information

CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA

CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA Data Analysis: Describing Data CHAPTER 3 DATA ANALYSIS: DESCRIBING DATA In the analysis process, the researcher tries to evaluate the data collected both from written documents and from other sources such

More information

Introduction to Statistical Data Analysis I

Introduction to Statistical Data Analysis I Introduction to Statistical Data Analysis I JULY 2011 Afsaneh Yazdani Preface What is Statistics? Preface What is Statistics? Science of: designing studies or experiments, collecting data Summarizing/modeling/analyzing

More information

3.2 Least- Squares Regression

3.2 Least- Squares Regression 3.2 Least- Squares Regression Linear (straight- line) relationships between two quantitative variables are pretty common and easy to understand. Correlation measures the direction and strength of these

More information

c. Construct a boxplot for the data. Write a one sentence interpretation of your graph.

c. Construct a boxplot for the data. Write a one sentence interpretation of your graph. STAT 280 Sample Test Problems Page 1 of 1 1. An English survey of 3000 medical records showed that smokers are more inclined to get depressed than non-smokers. Does this imply that smoking causes depression?

More information

STAT 201 Chapter 3. Association and Regression

STAT 201 Chapter 3. Association and Regression STAT 201 Chapter 3 Association and Regression 1 Association of Variables Two Categorical Variables Response Variable (dependent variable): the outcome variable whose variation is being studied Explanatory

More information

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

3 CONCEPTUAL FOUNDATIONS OF STATISTICS 3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical

More information

CP Statistics Sem 1 Final Exam Review

CP Statistics Sem 1 Final Exam Review Name: _ Period: ID: A CP Statistics Sem 1 Final Exam Review Multiple Choice Identify the choice that best completes the statement or answers the question. 1. A particularly common question in the study

More information

Test 1C AP Statistics Name:

Test 1C AP Statistics Name: Test 1C AP Statistics Name: Part 1: Multiple Choice. Circle the letter corresponding to the best answer. 1. At the beginning of the school year, a high-school teacher asks every student in her classes

More information

Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE

Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE Chapter 2 Norms and Basic Statistics for Testing MULTIPLE CHOICE 1. When you assert that it is improbable that the mean intelligence test score of a particular group is 100, you are using. a. descriptive

More information

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj

Statistical Techniques. Masoud Mansoury and Anas Abulfaraj Statistical Techniques Masoud Mansoury and Anas Abulfaraj What is Statistics? https://www.youtube.com/watch?v=lmmzj7599pw The definition of Statistics The practice or science of collecting and analyzing

More information

Chapter 2--Norms and Basic Statistics for Testing

Chapter 2--Norms and Basic Statistics for Testing Chapter 2--Norms and Basic Statistics for Testing Student: 1. Statistical procedures that summarize and describe a series of observations are called A. inferential statistics. B. descriptive statistics.

More information

Appendix B Statistical Methods

Appendix B Statistical Methods Appendix B Statistical Methods Figure B. Graphing data. (a) The raw data are tallied into a frequency distribution. (b) The same data are portrayed in a bar graph called a histogram. (c) A frequency polygon

More information

Readings: Textbook readings: OpenStax - Chapters 1 4 Online readings: Appendix D, E & F Online readings: Plous - Chapters 1, 5, 6, 13

Readings: Textbook readings: OpenStax - Chapters 1 4 Online readings: Appendix D, E & F Online readings: Plous - Chapters 1, 5, 6, 13 Readings: Textbook readings: OpenStax - Chapters 1 4 Online readings: Appendix D, E & F Online readings: Plous - Chapters 1, 5, 6, 13 Introductory comments Describe how familiarity with statistical methods

More information

t-test for r Copyright 2000 Tom Malloy. All rights reserved

t-test for r Copyright 2000 Tom Malloy. All rights reserved t-test for r Copyright 2000 Tom Malloy. All rights reserved This is the text of the in-class lecture which accompanied the Authorware visual graphics on this topic. You may print this text out and use

More information

Political Science 15, Winter 2014 Final Review

Political Science 15, Winter 2014 Final Review Political Science 15, Winter 2014 Final Review The major topics covered in class are listed below. You should also take a look at the readings listed on the class website. Studying Politics Scientifically

More information

Lesson 1: Distributions and Their Shapes

Lesson 1: Distributions and Their Shapes Lesson 1 Name Date Lesson 1: Distributions and Their Shapes 1. Sam said that a typical flight delay for the sixty BigAir flights was approximately one hour. Do you agree? Why or why not? 2. Sam said that

More information

Chapter 1. Picturing Distributions with Graphs

Chapter 1. Picturing Distributions with Graphs Chapter 1 Picturing Distributions with Graphs Statistics Statistics is a science that involves the extraction of information from numerical data obtained during an experiment or from a sample. It involves

More information

3.2A Least-Squares Regression

3.2A Least-Squares Regression 3.2A Least-Squares Regression Linear (straight-line) relationships between two quantitative variables are pretty common and easy to understand. Our instinct when looking at a scatterplot of data is to

More information

Descriptive Statistics Lecture

Descriptive Statistics Lecture Definitions: Lecture Psychology 280 Orange Coast College 2/1/2006 Statistics have been defined as a collection of methods for planning experiments, obtaining data, and then analyzing, interpreting and

More information

Clever Hans the horse could do simple math and spell out the answers to simple questions. He wasn t always correct, but he was most of the time.

Clever Hans the horse could do simple math and spell out the answers to simple questions. He wasn t always correct, but he was most of the time. Clever Hans the horse could do simple math and spell out the answers to simple questions. He wasn t always correct, but he was most of the time. While a team of scientists, veterinarians, zoologists and

More information

Students will understand the definition of mean, median, mode and standard deviation and be able to calculate these functions with given set of

Students will understand the definition of mean, median, mode and standard deviation and be able to calculate these functions with given set of Students will understand the definition of mean, median, mode and standard deviation and be able to calculate these functions with given set of numbers. Also, students will understand why some measures

More information

Observational studies; descriptive statistics

Observational studies; descriptive statistics Observational studies; descriptive statistics Patrick Breheny August 30 Patrick Breheny University of Iowa Biostatistical Methods I (BIOS 5710) 1 / 38 Observational studies Association versus causation

More information

Measuring the User Experience

Measuring the User Experience Measuring the User Experience Collecting, Analyzing, and Presenting Usability Metrics Chapter 2 Background Tom Tullis and Bill Albert Morgan Kaufmann, 2008 ISBN 978-0123735584 Introduction Purpose Provide

More information

CHAPTER 3 Describing Relationships

CHAPTER 3 Describing Relationships CHAPTER 3 Describing Relationships 3.1 Scatterplots and Correlation The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Reading Quiz 3.1 True/False 1.

More information

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test February 2016

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test February 2016 UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test February 2016 STAB22H3 Statistics I, LEC 01 and LEC 02 Duration: 1 hour and 45 minutes Last Name: First Name:

More information

Chapter 3 CORRELATION AND REGRESSION

Chapter 3 CORRELATION AND REGRESSION CORRELATION AND REGRESSION TOPIC SLIDE Linear Regression Defined 2 Regression Equation 3 The Slope or b 4 The Y-Intercept or a 5 What Value of the Y-Variable Should be Predicted When r = 0? 7 The Regression

More information

Chapter 3: Describing Relationships

Chapter 3: Describing Relationships Chapter 3: Describing Relationships Objectives: Students will: Construct and interpret a scatterplot for a set of bivariate data. Compute and interpret the correlation, r, between two variables. Demonstrate

More information

Stats 95. Statistical analysis without compelling presentation is annoying at best and catastrophic at worst. From raw numbers to meaningful pictures

Stats 95. Statistical analysis without compelling presentation is annoying at best and catastrophic at worst. From raw numbers to meaningful pictures Stats 95 Statistical analysis without compelling presentation is annoying at best and catastrophic at worst. From raw numbers to meaningful pictures Stats 95 Why Stats? 200 countries over 200 years http://www.youtube.com/watch?v=jbksrlysojo

More information

(a) 50% of the shows have a rating greater than: impossible to tell

(a) 50% of the shows have a rating greater than: impossible to tell q 1. Here is a histogram of the Distribution of grades on a quiz. How many students took the quiz? What percentage of students scored below a 60 on the quiz? (Assume left-hand endpoints are included in

More information

Lecture 12: more Chapter 5, Section 3 Relationships between Two Quantitative Variables; Regression

Lecture 12: more Chapter 5, Section 3 Relationships between Two Quantitative Variables; Regression Lecture 12: more Chapter 5, Section 3 Relationships between Two Quantitative Variables; Regression Equation of Regression Line; Residuals Effect of Explanatory/Response Roles Unusual Observations Sample

More information

Data, frequencies, and distributions. Martin Bland. Types of data. Types of data. Clinical Biostatistics

Data, frequencies, and distributions. Martin Bland. Types of data. Types of data. Clinical Biostatistics Clinical Biostatistics Data, frequencies, and distributions Martin Bland Professor of Health Statistics University of York http://martinbland.co.uk/ Types of data Qualitative data arise when individuals

More information

Department of Statistics TEXAS A&M UNIVERSITY STAT 211. Instructor: Keith Hatfield

Department of Statistics TEXAS A&M UNIVERSITY STAT 211. Instructor: Keith Hatfield Department of Statistics TEXAS A&M UNIVERSITY STAT 211 Instructor: Keith Hatfield 1 Topic 1: Data collection and summarization Populations and samples Frequency distributions Histograms Mean, median, variance

More information

DO NOT OPEN THIS BOOKLET UNTIL YOU ARE TOLD TO DO SO

NATS 1500 Mid-term test A1 Name (PRINT) Student Number Signature Instructions: York University DIVISION OF NATURAL SCIENCE NATS 1500 3.0 Statistics and Reasoning in Modern Society Mid-Term

STATISTICS AND RESEARCH DESIGN

These are subjects that are frequently confused. Both subjects often evoke student anxiety and avoidance. To further complicate matters, both areas appear to have

9 research designs likely for PSYC 2100

1) 1 factor, 2 levels, 1 group (one group gets both treatment levels) related samples t-test (compare means of 2 levels only) 2) 1 factor, 2 levels, 2 groups (one

Chapter 3: Examining Relationships

Name Date Per Key Vocabulary: response variable explanatory variable independent variable dependent variable scatterplot positive association negative association linear correlation r-value regression

CHAPTER 2. MEASURING AND DESCRIBING VARIABLES

1. A. Age: name/interval; military dictatorship: value/nominal; strongly oppose: value/ordinal; election year: name/interval; 62 percent: value/interval;

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

TECHNICAL REPORT. CONTENTS: Executive Summary; Introduction; Overview of Data Analysis Concepts

CHAPTER ONE CORRELATION

1.0 Introduction The first chapter focuses on the nature of statistical data of correlation. The aim of the series of exercises is to ensure the students are able to use SPSS to

2.4.1 STA-O Assessment 2

Work all the problems and determine the correct answers. When you have completed the assessment, open the Assessment 2 activity and input your responses into the online grading

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Exam Name 1) Which of the following is the properly rounded mean for the given data? 7, 8, 13, 9, 10, 11 A)

Standard Deviation and Standard Error Tutorial. This is significantly important. Get your AP Equations and Formulas sheet

The Basics: Let's start with a review of the basics of statistics. Mean: What most

Knowledge discovery tools

hours, and prime time is prime time precisely because more people tend to watch television at that time. Compare histograms from different periods of time. Changes in histogram

(a) 50% of the shows have a rating greater than: impossible to tell

KEY 1. Here is a histogram of the Distribution of grades on a quiz. How many students took the quiz? 15 What percentage of students scored below a 60 on the quiz? (Assume left-hand endpoints are included

Statistics. Nur Hidayanto PSP English Education Dept. SStatistics/Nur Hidayanto PSP/PBI

RESEARCH STATISTICS WHAT'S THE RELATIONSHIP? RESEARCH positivistic Prepositivistic Postpositivistic Data Initial Observation (research Question)

Math 1680 Class Notes. Chapters: 1, 2, 3, 4, 5, 6

Chapter 1. Controlled Experiments Salk vaccine field trial: a randomized controlled double-blind design 1. Suppose they gave the vaccine to everybody, and

Chapter 1: Explaining Behavior

GOAL OF SCIENCE is to generate explanations for various puzzling natural phenomena. - Generate general laws of behavior (psychology) RESEARCH: principal method for acquiring

ISC- GRADE XI HUMANITIES (2018-19) PSYCHOLOGY. Chapter 2- Methods of Psychology

OUTLINE OF THE CHAPTER (i) Scientific Methods in Psychology - observation, case study, surveys, psychological tests, experimentation

OCW Epidemiology and Biostatistics, 2010 David Tybor, MS, MPH and Kenneth Chui, PhD Tufts University School of Medicine October 27, 2010

SAMPLING AND CONFIDENCE INTERVALS Learning objectives for this session:

Research Methods. It is actually way more exciting than it sounds!!!!

Why do we have to learn this stuff? Psychology is first and foremost a science. Thus it is based in research. Before we delve into how

STT 200 Test 1 Green Give your answer in the scantron provided. Each question is worth 2 points.

For Questions 1 & 2: It is known that the distribution of starting salaries for MSU Education majors has

LOTS of NEW stuff right away 2. The book has calculator commands 3. About 90% of technology by week 5

Three adventurers are in a hot air balloon. Soon, they find themselves lost in a canyon in

PRINCIPLES OF STATISTICS

STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis

Sheila Barron Statistics Outreach Center 2/8/2011

What is Power? When conducting a research study using a statistical hypothesis test, power is the probability of getting statistical significance when