Chapter 1: Introduction to Statistics

Statistics, Science, and Observations

Definition: The term statistics refers to a set of mathematical procedures for organizing, summarizing, and interpreting information.

Statistics serve two general purposes:

(1) Statistics are used to organize and summarize information so that the researcher can see what happened in the research study and can communicate the results to others.

(2) Statistics help the researcher answer the general questions that initiated the research by determining exactly what conclusions are justified based on the results that were obtained.

Statistical procedures help ensure that the information or observations are presented and interpreted in an accurate and informative way. In addition, statistics provide researchers with a set of standardized techniques that are recognized and understood throughout the scientific community.

Populations and Samples

Definition: A population is the set of all the individuals or units of interest in a particular study.

Definition: A sample is a set of individuals selected from a population, usually intended to represent the population in a research study.

Variables

Definition: A variable is a characteristic or condition that can change or take on different values.

Definition: Data (plural) are measurements or observations. A data set is a collection of measurements or observations. A datum (singular) is a single measurement or observation and is commonly called a score or raw score.

Parameters and Statistics

When describing data, it is necessary to distinguish whether the data come from a population or a sample. A characteristic that describes a population, for example the population average, is called a parameter. A characteristic that describes a sample, on the other hand, is called a statistic. Thus, the average score for a sample is an example of a statistic. Typically, the research process begins with a question about a population parameter. However, the actual data come from a sample and are used to compute sample statistics.

Descriptive and Inferential Statistical Methods

Definition: Descriptive statistics are statistical procedures used to summarize, organize, and simplify data.

Definition: Inferential statistics consist of techniques that allow us to study samples and then make generalizations about the populations from which they were selected.

Definition: Sampling error is the discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter.

Statistics in the Context of Research (Example 1.1)
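The distinction between a parameter, a statistic, and sampling error defined above can be made concrete with a small simulation. The following is a minimal sketch, not part of the original slides: the population of 1,000 scores, its mean of 100 and standard deviation of 15, the sample size of 25, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics

# Hypothetical population of 1,000 scores (made-up data, for illustration only).
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.gauss(100, 15) for _ in range(1000)]

# Parameter: a characteristic that describes the whole population.
population_mean = statistics.fmean(population)

# Statistic: the same characteristic computed from a sample of 25 individuals.
sample = rng.sample(population, 25)
sample_mean = statistics.fmean(sample)

# Sampling error: the discrepancy between the statistic and the parameter.
sampling_error = sample_mean - population_mean

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
print(f"sampling error:              {sampling_error:+.2f}")
```

Drawing a different sample (for example, by changing the seed) would give a different sample mean and therefore a different sampling error, while the population mean stays fixed.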

Data Structures, Research Methods, and Statistics

Most research is intended to examine the relationship between two or more variables. To establish the existence of a relationship, researchers must make observations, that is, measurements of the variables under study. The resulting measurements can be classified into two distinct data structures that also help to classify different research methods and different statistical techniques. In the following section (next couple of slides) we identify and discuss these two data structures.

(1) Measuring two variables for each individual: The Correlational Method

In the correlational method, two different variables are observed to determine whether there is a relationship between them.

Occasionally, the correlational method produces scores that are not numerical values. This type of data is typically summarized in a table showing how many individuals are classified into each of the possible categories. Table 1.1 shows an example of this kind of summary table.

The relationship between variables for non-numerical data, such as the data in Table 1.1, is evaluated using a statistical technique known as a chi-square test. (Chi-square tests are presented in Chapter 18.)

The results from a correlational study can demonstrate the existence of a relationship between two variables, but they do not provide an explanation for the relationship. In particular, a correlational study cannot demonstrate a cause-and-effect relationship.
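The kind of summary table described above for non-numerical correlational data can be built by counting how many individuals fall into each combination of categories. The sketch below is illustrative only: the records and category labels are hypothetical and are not the contents of Table 1.1.

```python
from collections import Counter

# Hypothetical records: one (variable A category, variable B category) pair per individual.
records = [
    ("group A", "yes"), ("group A", "no"), ("group A", "yes"),
    ("group B", "no"), ("group B", "no"), ("group B", "yes"),
    ("group A", "yes"), ("group B", "no"),
]

# Count how many individuals are classified into each combination of categories.
counts = Counter(records)

rows = sorted({a for a, _ in records})
cols = sorted({b for _, b in records})

# Print the frequencies as a simple two-way table.
print(f"{'':10}" + "".join(f"{c:>6}" for c in cols))
for r in rows:
    print(f"{r:10}" + "".join(f"{counts[(r, c)]:>6}" for c in cols))
```

A table of frequencies like this one is the usual input to a chi-square test; the test itself is not computed here.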

To demonstrate a cause-and-effect relationship between two variables, researchers must use the experimental method, which is discussed in the next section.

(2) Comparing two (or more) groups of scores: The Experimental Method and Non-Experimental Methods

The Experimental Method

The goal of an experimental study is to demonstrate a cause-and-effect relationship between two variables. Specifically, an experiment attempts to show that changing the value of one variable will cause changes to occur in the second variable. To accomplish this goal, the experimental method has two characteristics that differentiate experiments from other types of research studies:

Manipulation: The researcher manipulates one variable by changing its value from one level to another. A second variable is observed (measured) to determine whether the manipulation causes changes to occur.

Control: The researcher must exercise control over the research situation to ensure that other, extraneous variables do not influence the relationship being examined.

There are two general categories of variables that researchers must consider:

Participant Variables: These are characteristics such as age, gender, and intelligence that vary from one individual to another.

Environmental Variables: These are characteristics of the environment such as lighting, time of day, and weather conditions.

Researchers typically use one of three basic techniques to control other variables: random assignment, matching, and holding them constant.

The Seven Factors Needed for a Classic Experimental Design

Independent variable
Dependent variable
Control condition/group
Experimental condition/group
Random assignment
Pre-test
Post-test

Other Types of Studies

Other types of research studies, known as non-experimental or quasi-experimental, are similar to experiments because they also compare groups of scores. These studies do not use a manipulated variable to differentiate the groups. Instead, the variable that differentiates the groups is usually a pre-existing participant variable (such as male/female) or a time variable (such as before/after). Because these studies do not use the manipulation and control of true experiments, they cannot demonstrate cause-and-effect relationships. As a result, they are similar to correlational research because they simply demonstrate and describe relationships.

Variables and Measurement

Constructs and Operational Definitions

Constructs are internal attributes or characteristics that cannot be directly observed but are useful for describing and explaining behavior.

An operational definition identifies a measurement procedure (a set of operations) for measuring an external behavior and uses the resulting measurements as a definition and a measurement of a hypothetical construct.
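Random assignment, one of the techniques listed earlier for controlling other variables, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the participant IDs, the group size of 10, and the seed are hypothetical, and real studies may use other assignment procedures.

```python
import random

# Hypothetical participant IDs; in a real study these would be the recruited sample.
participants = [f"P{i:02d}" for i in range(1, 21)]

rng = random.Random(7)  # fixed seed only so the sketch is reproducible
shuffled = participants[:]  # copy, so the original list keeps its order
rng.shuffle(shuffled)

# Split the shuffled list in half: chance alone decides who lands in which group,
# so participant variables (age, intelligence, ...) tend to balance out across groups.
half = len(shuffled) // 2
control_group = shuffled[:half]
experimental_group = shuffled[half:]

print("control:     ", control_group)
print("experimental:", experimental_group)
```

Because every participant has the same chance of ending up in either group, no participant variable is systematically tied to the manipulated variable.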

Discrete and Continuous Variables

Definition: A discrete variable consists of separate, indivisible categories. Thus, no values can exist between two neighboring categories.

Discrete variables are commonly restricted to whole, countable numbers. Examples include the number of children in a family or the number of students attending class. A discrete variable may also consist of observations that differ qualitatively. For example, a psychologist observing patients may classify some as having panic disorders, others as having dissociative disorders, and some as having psychotic disorders.

Definition: For a continuous variable, there are an infinite number of possible values that fall between any two observed values. In other words, a continuous variable is divisible into an infinite number of fractional parts.

Two other factors apply to continuous variables:

(1) When measuring a continuous variable, it should be very rare to obtain identical measurements for two different individuals. Because a continuous variable has an infinite number of possible values, it should be almost impossible for two people to have exactly the same score.

(2) When measuring a continuous variable, each measurement category is actually an interval that must be defined by boundaries called real limits (an upper real limit and a lower real limit).
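The idea of real limits can be shown with a short calculation. The sketch below assumes the usual convention that real limits lie exactly halfway between adjacent measurement categories; the function name `real_limits`, the `unit` parameter, and the example scores are hypothetical and do not come from the slides.

```python
def real_limits(score, unit=1.0):
    """Return the (lower, upper) real limits of a measurement category.

    Assumes the limits sit half a measurement unit below and above the score.
    """
    half = unit / 2
    return score - half, score + half


# A weight recorded as 150 when measuring to the nearest whole unit (assumed example).
lower, upper = real_limits(150)
print(lower, upper)  # 149.5 150.5

# The same idea when measuring to the nearest half unit.
print(real_limits(12.5, unit=0.5))  # (12.25, 12.75)
```

A recorded score of 150 therefore stands for any true value in the interval from 149.5 to 150.5.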

Scales of Measurement

Nominal: A nominal scale consists of a set of categories that have different names. Measurements on a nominal scale label and categorize observations, but do not make any quantitative distinctions between observations.

Ordinal: An ordinal scale consists of a set of categories that are organized in an ordered sequence. Measurements on an ordinal scale rank observations in terms of size or magnitude.

Interval: An interval scale consists of ordered categories that are all intervals of exactly the same size. Equal differences between numbers on the scale reflect equal differences in magnitude. However, the zero point on an interval scale is arbitrary and does not indicate a zero amount of the variable being measured.

Ratio: A ratio scale is an interval scale with the additional feature of an absolute zero point. With a ratio scale, ratios of numbers do reflect ratios of magnitude.

Order of Mathematical Operations

(1) Any calculation contained within parentheses is done first.

(2) Squaring (or raising to other exponents) is done second.

(3) Multiplying and/or dividing is done third. A series of multiplication and/or division operations should be done in order from left to right.

(4) Summation using the Σ notation is done next.

(5) Finally, any other addition and/or subtraction is done.
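The order of operations above determines how expressions involving Σ are read, and a few worked cases make the differences visible. The following sketch uses a hypothetical set of four scores; the variable names are illustrative only.

```python
scores = [3, 1, 7, 4]  # hypothetical X values

# ΣX: add the scores.
sum_x = sum(scores)                                   # 3 + 1 + 7 + 4 = 15

# ΣX²: squaring comes before summation, so square each score, then add.
sum_of_squares = sum(x ** 2 for x in scores)          # 9 + 1 + 49 + 16 = 75

# (ΣX)²: parentheses come first, so add the scores, then square the total.
square_of_sum = sum(scores) ** 2                      # 15² = 225

# Σ(X - 1)²: parentheses first, then squaring, then summation.
sum_sq_dev = sum((x - 1) ** 2 for x in scores)        # 4 + 0 + 36 + 9 = 49

# ΣX - 1: summation comes before ordinary subtraction.
sum_minus_one = sum(scores) - 1                       # 15 - 1 = 14

print(sum_x, sum_of_squares, square_of_sum, sum_sq_dev, sum_minus_one)
```

Note that ΣX² (75) and (ΣX)² (225) differ only in the order in which squaring and summing are carried out.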