CHAPTER NINE DATA ANALYSIS / EVALUATING QUALITY (VALIDITY) OF BETWEEN GROUP EXPERIMENTS
- Abner Thomas
Chapter Objectives:
- Understand Null Hypothesis Significance Testing (NHST)
- Understand statistical significance and probabilities (p-values)
- Understand Type I and Type II Errors in hypothesis testing
- Understand t-tests, ANOVA / MANOVA, and ANCOVA / MANCOVA and how they are used in analyzing experiments and quasi-experiments
- Understand Effect Size Estimates
- Understand the difference between statistical and practical significance
- Understand Power Analysis
- Understand how validity (quality) of experiments and quasi-experiments is evaluated
- Understand internal validity and threats
- Understand statistical conclusion validity

I. Data Analysis

Data analysis in experimental and quasi-experimental research is a multilayered process that begins with collecting data on the dependent variable after the experiment has been implemented and ends with making inferences about the statistical soundness and the meaning of the results. There are three complementary approaches to statistical analysis.
- Null Hypothesis Significance Testing (NHST) is the term applied to the set of procedures that (1) establishes statistical significance of results and (2) confirms or disconfirms an experiment's hypothesis.
- Effect Size refers to the magnitude of differences on the dependent variable after a treatment.
- Power Analysis refers to the probability of avoiding errors in conclusions drawn from NHST.
Null Hypothesis Significance Testing: Significance Testing

Statistical significance is a low probability that results are due to sampling error or chance and a high probability that results are due to the treatment. Significance testing is the first step in NHST. It is important to note that significance as used here does not mean importance. It merely reports on the probability that outcomes are not due to error, and thus affirms that the outcomes are real and are due to a treatment.

Significance is based on probability. p represents the level of probability that the differences are due to various sources of error (e.g., sampling). In most cases, the lower the p-value, the better. Alpha is the p-value that the researcher sets at the beginning of the study as an acceptable level of probability that differences are due to sampling error.
p < .05 is the conventional alpha used in educational research. This means there is less than a 5% probability the differences are due to sampling error and a high probability the differences are due to the treatment. Probabilities are derived from the normal distribution curve. Remember from Chapter Six that about 95% of values will randomly occur within the first two standard deviations (plus or minus 2 standard deviations) on the curve.

Figure 1. Normal Distribution Curve, p = .05

By establishing alpha at .05, the researcher sets a standard: results that fall within the range where 95% of randomly occurring values lie (the shaded area) are attributed to sampling error or chance, and only results outside that range count as significant. Quantitative researchers use inferential statistical tests to arrive at a p-value. Inferential statistics use probability theory to make inferences from data about the likelihood that results occurred by chance and whether the
results can be generalized from the subjects in the sample to all of the subjects in the population from which the sample was drawn. The inferential tests that use the means and standard deviations on measured outcomes to calculate statistical significance in experiments are the t-test, ANOVA / MANOVA, and ANCOVA / MANCOVA.

t-tests

t-tests compare outcomes of two groups or two variables. There are two types of t-tests: 1) t-tests for independent means, which analyze the difference in means between two groups, and 2) t-tests for correlated means, which analyze differences in means for the same group before and after a treatment. The t-test yields a t-value that leads to a p-value.

For example, in a post-test-only true experiment with randomly assigned treatment and control groups, the researcher sets alpha at p < .05. He uses a t-test to compare math scores of students who were instructed with an innovative method to those who had traditional instruction; the test yields t = 2.29. He then derives a p-value from the t-value to see if the alpha has been achieved. The summary data display will look like this:

            Treatment   Control   t
N = 60
Mean                              *
SD
df = 58
* p = .05
The number of subjects in the study is designated by N (N = 60). The (*) that appears next to the t-value directs us to the bottom of the chart, where we see df (58). This stands for degrees of freedom (the number of subjects minus the number of groups: 60 - 2) and is used to find p = .05 on a Table of Critical t-values. A segment of that table for degrees of freedom from 40 to 75 is presented below.

Table 1. Critical t-values (df = 40-75); columns: df, probability value (p)

In our example df = 58. When we look down the table we use the row for df = 60, which is close enough. When we scan across that row to the column for p = .05, we find the critical t-value. Any t-value above that would signify statistical significance. Since t = 2.29 in our example, the results meet the criterion for .05 and are significant.
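The t-value and degrees of freedom described above can be illustrated with a short calculation. The Python sketch below is not part of the original chapter: the scores are hypothetical, the function name independent_t is an assumption made for illustration, and the critical value 2.145 (two-tailed, p = .05, df = 14) comes from a standard table of critical t-values. It uses the pooled-variance form of the t-test for independent means.

```python
import math
import statistics


def independent_t(treatment, control):
    """Return (t, df) for a pooled-variance t-test for independent means."""
    n1, n2 = len(treatment), len(control)
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    # Degrees of freedom: number of subjects minus number of groups.
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its own df.
    pooled_var = (
        (n1 - 1) * statistics.variance(treatment)
        + (n2 - 1) * statistics.variance(control)
    ) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / standard_error, df


# Hypothetical post-test math scores (illustration only, not from the chapter).
treatment_scores = [78, 85, 90, 72, 88, 95, 81, 79]
control_scores = [70, 75, 80, 68, 74, 82, 77, 71]

t_value, df = independent_t(treatment_scores, control_scores)

# Assumed critical value from a standard t table: two-tailed, p = .05, df = 14.
CRITICAL_T = 2.145
significant = abs(t_value) > CRITICAL_T

print(f"t = {t_value:.2f}, df = {df}, significant at .05: {significant}")
```

With these made-up scores the t-value exceeds the critical value, so the difference would be reported as significant at the .05 level. With real data, the researcher would use the row of the table that matches the study's own df.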
ANOVA / MANOVA

ANOVA (Analysis of Variance) and MANOVA (Multivariate Analysis of Variance) also compare outcomes between groups; ANOVA is more robust than t-tests in that it can compare outcomes on two or more groups. ANOVA is used to analyze differences between two or more variables and differences within groups. MANOVA is used when there are two or more dependent variables. Both the ANOVA and MANOVA yield an F-value that leads to a p-value.

For example, a researcher uses ANOVA to determine whether there are differences in results from three approaches to reinforcing skills in math. The sample is selected from two fourth-grade classrooms in a school; students are randomly assigned to three groups after receiving direct instruction. Group #1 works in cooperative learning groups; group #2 receives instruction from an interactive computer program; and group #3 uses concept mapping strategies.

              Group #1 Cooperative   Group #2 Computer   Group #3 Concept Maps
              (n = 15)               (n = 14)            (n = 15)
Mean Scores                                              *
SD
* F (7.14); p = .002

The ANOVA generated an F-value of 7.14, which resulted in p = .002, which is below the .05 alpha. The results are statistically significant for the concept-mapping group. The conversion from F to p is based on a Table of
Critical F-values that, like the table of t-values, lists degrees of freedom and p-values.

ANCOVA / MANCOVA

ANCOVA (Analysis of Covariance) and MANCOVA (Multivariate Analysis of Covariance) are predominantly used in quasi-experimental studies, when there is a likelihood that the groups are non-equivalent at the start; they are also used in other experimental designs in order to ensure the equivalence of the groups. These tests provide more certainty that the outcomes of a quasi-experiment are not being affected by the differences, or variances, that exist in the subjects before the experiment begins. An ANCOVA or MANCOVA is performed at the end of the experiment to level the playing field; it is like a golf handicap that accommodates differences in competitors before the round begins. ANCOVA and MANCOVA yield an F-value that leads to a p-value.

For example, in a quasi-experimental study that uses two intact classrooms to investigate the effect of an innovative approach to language arts on the two variables of writing and reading scores, the researcher uses MANCOVA to level the playing field between the two non-equivalent groups. The display of data will look like this:
Pre-Test Scores
             Writing       Reading
             M     SD      M     SD
Treatment
Control

Post-Test Scores
             Writing       Reading
             M     SD      M     SD
Treatment
Control

* F (11); p = .0018
** F (20); p = .0001

The MANCOVA generated F-values of 11 and 20, which led to p = .0018 and p = .0001. These values are below the .05 alpha; the results are statistically significant for both reading and writing. These F-values tell you that there are statistically significant differences, but they do not specify exactly which groups are different; researchers follow up with t-tests to make these comparisons.

One- and Two-Tailed Tests

Researchers choose between using a one-tailed or a two-tailed inferential test. A one-tailed test uses one end, or tail, of the curve to generate p = .05; a two-tailed test uses two ends, or tails, with p = .025 on each end of the curve, to generate p = .05. The illustration below shows these probabilities.
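As a numeric companion to the one- and two-tailed description above, the sketch below computes where the cutoffs fall on a standard normal curve when alpha is .05. It is an illustration added here, not the chapter's own procedure: it assumes a normal (z) distribution, whereas t-tests use the t distribution, whose cutoffs are somewhat larger for small samples. The helper name is_significant is hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed: all of alpha (.05) sits in a single tail of the curve.
one_tailed_cutoff = standard_normal.inv_cdf(1 - ALPHA)      # about 1.645

# Two-tailed: alpha is split, .025 in each tail.
two_tailed_cutoff = standard_normal.inv_cdf(1 - ALPHA / 2)  # about 1.960


def is_significant(z_value, tails):
    """Compare a z-value with the cutoff for a one- or two-tailed test.

    For the one-tailed case the predicted direction is assumed positive.
    """
    if tails == 1:
        return z_value > one_tailed_cutoff
    return abs(z_value) > two_tailed_cutoff


# The same result can pass the one-tailed test and fail the two-tailed test.
print(is_significant(1.8, tails=1), is_significant(1.8, tails=2))
```

Because the two-tailed cutoff lies farther from the mean, a result has to be more extreme to reach significance when the probability is divided between both tails.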
Figure 2. One-tailed tests at p = .05
Figure 3. Two-tailed test

The most objective test is the two-tailed test. It represents a higher standard and degree of difficulty for results because the probability must be distributed to both
ends of the curve. A one-tailed test is more lenient because all of the probability for the hypothesis test is on one side of the distribution curve. In the final analysis, what matters is the probability (p) value and, moreover, a theory that provides a rationale for making a well-founded prediction.

Null Hypothesis Significance Testing: Hypothesis Testing

Hypothesis testing is the next step in the process of null hypothesis significance testing (NHST) and uses statistical significance to confirm or disconfirm a hypothesis. As a reminder, here are the definitions that are useful in this discussion.
- The hypothesis operationalizes the theory in terms of the independent and dependent variables.
- A hypothesis makes a prediction about the effect of the independent variable on the dependent variable and can be stated as either a directional or non-directional hypothesis.
- The directional hypothesis predicts that the treatment will result in a change in outcomes and that the change will be a positive result of the experiment.
- The non-directional hypothesis predicts that a treatment will result in a change in outcomes, but does not predict the direction of the change: whether it will be positive or negative.

In hypothesis testing, the researcher uses a construct known as the null hypothesis. The null hypothesis predicts there will be no significant difference between the experimental group(s) and control group(s) as a result of the treatment. To confirm that a hypothesis is true, the researcher has to demonstrate that the null is false. This is where significance testing comes into play. The researcher uses the p-value derived from significance testing as the probability level that the null is true. If p < .05, there is little probability the null is true; therefore the researcher can reject the null hypothesis and confirm that the hypothesis is true. In effect, the null hypothesis is set up as a straw man; it is easier to prove something false than it is to prove something true.

Below is a graphic that summarizes the multi-layered process of data analysis through NHST.
Figure 3. Summary of Null Hypothesis Significance Testing

Type I and Type II Errors

Hypothesis testing is often used in decision-making, and researchers have to guard against making inferential errors.
- Type I error means that the researcher erroneously rejects the null hypothesis. This is a false positive that leads to the incorrect conclusion that the hypothesis has been confirmed.
- Type II error occurs when the researcher erroneously accepts the null hypothesis. This is a false negative that leads to the incorrect conclusion that the hypothesis has not been confirmed.

Figure 4. Type I and Type II Errors; decision-making about the null hypothesis

A Type I error is called the alpha error because it usually occurs when the alpha may have been set too high (p = .05), leaving too much room for error. The researcher can avoid this error by setting a lower alpha (p = .025 or p = .01). The Type II error is called the beta error. To avoid this error, the researcher has to follow a set of
statistical procedures before the study begins. This is called power analysis and is described in the section below.

Effect Size Estimates and Power Analysis

NHST is not without its critics. As Thompson stated, "statistical significance is not sufficiently useful to be invoked as the sole criterion for evaluating the noteworthiness" of research (2002a, p. 66). Statistical significance does tell whether differences exist, but it does not tell the size, or magnitude, of those differences; nor does it provide insight into the ultimate meaningfulness of the research. And it may obscure Type I and Type II Errors. Researchers use effect size estimates and power analysis to address those concerns.

Effect Size Estimates

Effect size estimates convey "the magnitude of an effect or the strength of a relationship" (APA, 2001, p. 25); they are meaningful estimations of impact. Effect size estimates provide information that statistical significance does not; they provide evidence about what is termed practical significance.
Practical significance answers the question: Are the differences big enough to have real meaning and to guide decision-making in practical situations? Expressed as ES or d, effect size is an essential statistical calculation in social research. The fifth edition of the APA Publication Manual (2001) states, "The general principle to be followed is to provide the reader not only with information about statistical significance but also with enough information to assess the magnitude of the observed effect or relationship" (pp ).

There are several ways to calculate effect size. The formula commonly used in educational research is called the Glass Δ (delta). This formula simply calculates the difference between the means of the treatment and control groups and divides the answer by the standard deviation of the control group (Glass, 1976; Cohen, 1968).

Glass Δ = (Mean (experiment) - Mean (control)) / Standard deviation (control)

The Glass Δ can be translated into units of standard deviation gain: the greater the effect size, the greater the gain in SD units. By way of explanation: an effect size of 1.0 is equivalent to a one standard deviation change in outcomes on a normal distribution curve. An effect size of 0.5 is equivalent to a ½ standard deviation change; this would mean that the average pupil experiencing a treatment would improve by almost one half (½) a standard deviation on a standardized measure. For example, a student would move from the 50th
percentile to the 67th percentile on the math SAT, from a score of around 520 to a score of 570, as the result of a test preparation program.

Cohen developed the following guidelines for interpreting effect size. ES = 0.5 is considered adequate for establishing a difference of sufficient magnitude.
- < 0.1 = trivial effect (1/10 SD gain)
- small effect (up to ⅓ SD gain)
- moderate effect (up to ½ SD gain)
- > 0.5 = large effect (above ½ SD gain)

Just as researchers usually have as their goal to achieve an alpha of p < .05, they usually strive for ES = 0.5. Effect size is not only used to determine practical significance; it is also used in calculating statistical power.

Statistical Power and Power Analysis

Statistical power tells the likelihood that the researcher will avoid making a Type II error: accepting as true a null hypothesis that is false. Power is expressed as a probability; the higher the power, the greater the probability of avoiding error. A power analysis allows a researcher to calculate the sample size that will avoid a Type II error, given the desired alpha and effect size. A power of 0.80 is considered the acceptable threshold for avoiding error. Poor power cannot be corrected after an experiment is completed; to be useful, a power analysis must occur before the experiment begins.
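The effect size formula and the relationship among alpha, effect size, power, and sample size can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure a statistics package would follow: the function names are hypothetical, the SAT standard deviation of 100 is assumed for the example, and the sample-size formula uses a normal approximation for a two-tailed comparison of two group means, so it gives slightly smaller numbers than a calculation based on the t distribution.

```python
import math
from statistics import NormalDist


def glass_delta(mean_treatment, mean_control, sd_control):
    """Glass's delta: mean difference divided by the control group's SD."""
    return (mean_treatment - mean_control) / sd_control


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed in each of two groups (two-tailed test).

    Normal approximation: n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # cutoff for the chosen alpha
    z_power = z.inv_cdf(power)          # cutoff for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# SAT example from the text, assuming a control-group SD of 100 points.
delta = glass_delta(570, 520, 100)

# Sample size for alpha = .05, power = 0.80, and an effect size of 0.5.
n_medium = n_per_group(0.5)

# A larger expected effect needs fewer subjects; a stricter alpha needs more.
n_large_effect = n_per_group(0.8)
n_strict_alpha = n_per_group(0.5, alpha=0.025)

print(delta, n_medium, n_large_effect, n_strict_alpha)
```

Holding power at 0.80, the required sample shrinks as the expected effect size grows and rises as alpha is made stricter, which is why the sample size has to be settled before the experiment begins.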
1. Before the experiment begins, the researcher sets the desired alpha at .05 and the desired effect size.
2. The researcher conducts a statistical power analysis to determine the sample size necessary to reach the desired power.
3. The power analysis determines that a sample size of 65 subjects for each experimental group would be adequate to reach a 0.80 power.
4. If the researcher cannot have access to a sample of that size, she can increase power by raising the alpha (for example, to .10), by raising the ES to 0.80, or by doing both.

The concept of power is crucial to the conduct of responsible and sound research. An understanding of statistical power enables educators as consumers of research to ask the right questions about what the research says and to make informed judgments about effective interventions they might use in their own practice.

II. Validity of True and Quasi Experiments
In evaluating the quality of an experiment, a reader has to consider the overall validity of the study. Validity is the approximate truth of propositions, inferences, or conclusions (Trochim). There are three types of validity that, taken together, lead to the overall validity of an experiment.
- Conclusion validity answers the question: Is there a relationship between the independent and dependent variables?
- Internal validity answers the question: Is the relationship causal?
- External validity answers the question: Can we generalize findings to other people and settings?

There are barriers to achieving each validity component that researchers have to be mindful of before they can make a judgment about the quality of the validity profile of the study. Threats to validity are factors that interfere with a study and compromise our confidence about its results.

Conclusion Validity

Conclusion validity is the degree to which it is reasonable to conclude that there is a relationship between variables. Threats to conclusion validity compromise our confidence that a relationship does exist between variables. These threats include the following.
- Low reliability of measures: reliability < 0.7
- Low statistical power: inadequate sample size, alpha set too low
- "Fishing": analyzing and re-analyzing data; making multiple comparisons with the aim of finding significant results
- Mismatch of statistics to sample characteristics

Internal Validity

Internal validity concerns the level of control over causation that the researcher has in an experiment. It is the degree to which results are due to a causal relationship between variables and the effect on the DV is not due to any variables (extraneous variables) other than the independent variable. An experiment that has strong internal validity can unambiguously attribute the effects on the dependent variable to the treatment of the independent variable. Threats to internal validity compromise our confidence in saying that a causal relationship exists between the independent and dependent variables. The most common threats to internal validity are the following.
- History: An unanticipated event occurs while the experiment is in progress
- Maturation: Normal developmental processes that affect subjects differently over time
- Statistical Regression: Subjects at the extremes regress to the mean during post-tests
- Selection: The groups are not equivalent at the beginning of the study
- Mortality: Subjects drop out of the study
- Testing: The pre-test sensitizes subjects to the post-test
- Instrumentation: Changes in the way the dependent variable is measured
- Compensatory Rivalry ("The John Henry Effect"): Social competition motivates a group to over-perform and mask treatment effects

External Validity

External validity (or generalizability) is the degree to which the findings can be applied to other people, settings, and times. External validity can be approached in two ways: as population validity or as ecological validity.
- Population validity is the degree to which the results of an experiment can be generalized to individuals who were not in the study.
- Ecological validity is the extent to which the results of an experiment can be generalized to different settings.

Threats to external validity compromise confidence in stating whether the study's results are applicable to other people and settings. These threats include the following:
- For population validity: not having a representative sample, randomly selected from a target population.
- For ecological validity:
  o The Hawthorne effect (also called reactivity): Outcomes are due to the reaction of subjects to being studied and are not due to the treatment.
  o Experimenter effect: Outcomes are due to the characteristics of the person conducting the study.
  o Insufficient description of the conditions of the experiment: setting, treatment, and measurement

Evaluating Validity

Evaluating the various validities of between-group experiments and quasi-experiments is a complex undertaking. We recommend that research consumers form a judgment by using a framework that considers three criteria: 1) theory and treatment, 2) sampling, and 3) measurement.

Theory and Treatment:
- Fidelity to a theory that is supported by a well-referenced research review and is not subject to modification (through "fishing") enhances confidence about conclusion validity.
- A solid theory that is supported by a well-referenced research review and leads to a hypothesis that clearly states a causal relationship between variables enhances confidence about internal and external validity.
- A theory-based treatment that is clearly described and strictly implemented enhances confidence in internal and external validity.
Sampling:
- Adequate sample size and a good match between the sample and statistical analysis enhance confidence about conclusion validity.
- A detailed description of people in the sample (how many and subject characteristics) enhances confidence about internal and external validity.
- A detailed description of the setting enhances confidence about internal and ecological external validity.
- Control over sampling threats enhances confidence about internal and external validity.
- Random assignment of the sample to the treatment condition enhances confidence about internal and external validity.
- Random selection of the sample from a population enhances confidence about population external validity.

Measurement:
- Reliable measures (r ≥ 0.7) enhance confidence in conclusion, internal, and external validity.
- Valid measures enhance confidence in internal and external validity.
- Consistent measures enhance confidence in internal and external validity.
The table below summarizes these criteria and may serve as a template for evaluating validity.
Criterion: Theory and Treatment. Conclusion Validity: Clear statement of hypothesis that predicts how IV will affect DV. Internal Validity: Well-referenced research review leading to a solid causal theory or framework. External Validity (Population): Well-referenced research review leading to a solid causal theory or framework. External Validity (Ecological): Well-referenced research review leading to a solid causal theory or framework. No evidence of fishing. Clear statement of hypothesis that predicts how IV will affect DV. Evidence of history threat or testing threat? Clear statement of hypothesis that predicts how IV will affect DV. Clear statement of hypothesis that predicts how IV will affect DV. Detailed description of the treatment and conditions of the study, including history threat.
Criterion: Sampling. Adequate sample size determined by power analysis and match of sample to statistics. Adequate sample size determined by power analysis to avoid Type II Error. Detailed description of sample (size and subject characteristics). Evidence of Hawthorne Effect, Experimenter Effect, or insufficient description? Detailed description of setting.
Criterion: Measurement. Reliability of measures ≥ .7. Detailed description of sample (size and subject characteristics) and setting. Random assignment to treatment condition. Evidence of maturation, selection, regression, or compensatory rivalry threats? Use of reliable, valid, and consistent measures for DV. Evidence of instrumentation threat? Random selection from a population and random assignment to treatment. Use of reliable, valid, and consistent measures for DV. Random assignment to treatment. Evidence of Hawthorne Effect, Experimenter Effect, or insufficient description of the setting? Use of reliable, valid, and consistent measures for DV.
Rating (for each validity type): H M L
Summary of Criteria for Evaluating Studies

Summary

- There are three complementary approaches to statistical analysis in experimental and quasi-experimental research: Null Hypothesis Significance Testing (NHST), Effect Size, and Power Analysis.
- Inferential statistics build on descriptive statistics (means and standard deviations) to determine the likelihood that results occurred by chance and whether the results can be generalized.
- Inferential tests that are used in experiments and quasi-experiments are t-tests, ANOVA/MANOVA, and ANCOVA/MANCOVA.
- Inferential tests yield values that are converted to probabilities of results having occurred by chance.
- Experiments and quasi-experiments are evaluated for internal validity, statistical conclusion validity, and external validity (defined as ecological validity).
- There are several threats to internal validity that the researcher seeks to control.
- Theory/treatment, sampling, and measurement are key elements in evaluating validity of experiments and quasi-experiments.

Terms and Concepts

- NHST
- Effect size estimates
- p-value/probability
- t-test
- ANCOVA/MANCOVA
- degree of freedom
- null hypothesis
- internal validity
- statistical significance
- power analysis
- alpha
- ANOVA/MANOVA
- table of critical values
- one- and two-tailed tests
- Type I and Type II Errors
- threats to internal validity
- statistical conclusion validity
- fishing
- external validity
- population validity
- ecological validity
Review, Consolidation, and Extension of Knowledge

1. In your own words, explain null hypothesis significance testing.
2. In your own words, explain Effect Size Estimates and Power Analysis.
3. In your own words, explain the difference between statistical and practical significance.
4. Read the data analysis/results section of the experimental study you chose in the previous chapter and complete the Guide below.
5. Use the Guide as a template for writing a critique of about 1,000 words of the experimental or quasi-experimental between-group study you selected. See the Appendix for an exemplar.

Guide to Reading and Critiquing an Experimental and Quasi-Experimental Group Study

Research Review and Theory:
- What is the purpose of the research review?
- Does it establish an underlying theory (big ideas) for the research?

Purpose and Design:
- What is the purpose of the study?
- Is there a hypothesis or a research question? If so, what is it? If not, can you infer the question from the text of the article?
- What is the basic research design and type?
- What are the dependent and independent variables? Identify each type of variable in the study (IV=, DV=).

Sampling:
- How is the sample selected?
- How is the sample assigned to the treatment condition(s): random or nonrandom/intact group?
- Who is in the sample? What are the characteristics of the sample?
- What is the sample size?

Data Collection:
- What measures are used for the dependent variable?
- Are these standardized measures? Adapted measures? Newly created measures?
- What is the format of the measure(s)?
- Are there indications of validity and reliability of the measures? What are they (r-values)?

Data Analysis and Results:
- What statistical tests are used to analyze the data?
- Were the results (p-values) significant or non-significant?
- What does the researcher conclude about the findings?
Evaluation of Validity:
- How do you evaluate internal validity? What are threats to internal validity?
- How do you evaluate statistical conclusion validity? What is your rationale?
- How do you evaluate external population validity? Ecological validity? What is your rationale?
Two-Way Independent ANOVA Analysis of Variance (ANOVA) a common and robust statistical test that you can use to compare the mean scores collected from different conditions or groups in an experiment. There
More informationChapter 11. Experimental Design: One-Way Independent Samples Design
11-1 Chapter 11. Experimental Design: One-Way Independent Samples Design Advantages and Limitations Comparing Two Groups Comparing t Test to ANOVA Independent Samples t Test Independent Samples ANOVA Comparing
More informationReadings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F
Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Plous Chapters 17 & 18 Chapter 17: Social Influences Chapter 18: Group Judgments and Decisions
More informationChapter 11 Nonexperimental Quantitative Research Steps in Nonexperimental Research
Chapter 11 Nonexperimental Quantitative Research (Reminder: Don t forget to utilize the concept maps and study questions as you study this and the other chapters.) Nonexperimental research is needed because
More informationStill important ideas
Readings: OpenStax - Chapters 1 11 + 13 & Appendix D & E (online) Plous - Chapters 2, 3, and 4 Chapter 2: Cognitive Dissonance, Chapter 3: Memory and Hindsight Bias, Chapter 4: Context Dependence Still
More informationChapter 8: Estimating with Confidence
Chapter 8: Estimating with Confidence Key Vocabulary: point estimator point estimate confidence interval margin of error interval confidence level random normal independent four step process level C confidence
More informationOne-Way ANOVAs t-test two statistically significant Type I error alpha null hypothesis dependant variable Independent variable three levels;
1 One-Way ANOVAs We have already discussed the t-test. The t-test is used for comparing the means of two groups to determine if there is a statistically significant difference between them. The t-test
More informationClever Hans the horse could do simple math and spell out the answers to simple questions. He wasn t always correct, but he was most of the time.
Clever Hans the horse could do simple math and spell out the answers to simple questions. He wasn t always correct, but he was most of the time. While a team of scientists, veterinarians, zoologists and
More informationEXPERIMENTAL RESEARCH DESIGNS
ARTHUR PSYC 204 (EXPERIMENTAL PSYCHOLOGY) 14A LECTURE NOTES [02/28/14] EXPERIMENTAL RESEARCH DESIGNS PAGE 1 Topic #5 EXPERIMENTAL RESEARCH DESIGNS As a strict technical definition, an experiment is a study
More informationDescribe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo
Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment
More informationCHAPTER LEARNING OUTCOMES
EXPERIIMENTAL METHODOLOGY CHAPTER LEARNING OUTCOMES When you have completed reading this article you will be able to: Define what is an experiment Explain the role of theory in educational research Justify
More informationDescribe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo
Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 10, 11) Please note chapter
More information3 CONCEPTUAL FOUNDATIONS OF STATISTICS
3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical
More informationChapter 02 Developing and Evaluating Theories of Behavior
Chapter 02 Developing and Evaluating Theories of Behavior Multiple Choice Questions 1. A theory is a(n): A. plausible or scientifically acceptable, well-substantiated explanation of some aspect of the
More informationApplied Statistical Analysis EDUC 6050 Week 4
Applied Statistical Analysis EDUC 6050 Week 4 Finding clarity using data Today 1. Hypothesis Testing with Z Scores (continued) 2. Chapters 6 and 7 in Book 2 Review! = $ & '! = $ & ' * ) 1. Which formula
More informationDescribe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo
Please note the page numbers listed for the Lind book may vary by a page or two depending on which version of the textbook you have. Readings: Lind 1 11 (with emphasis on chapters 5, 6, 7, 8, 9 10 & 11)
More informationCHAPTER TEN SINGLE-SUBJECTS (QUASI-EXPERIMENTAL) RESEARCH
CHAPTER TEN SINGLE-SUBJECTS (QUASI-EXPERIMENTAL) RESEARCH Chapter Objectives: Understand the purpose of single subjects research and why it is quasiexperimental Understand sampling, data collection, and
More informationEmpirical Knowledge: based on observations. Answer questions why, whom, how, and when.
INTRO TO RESEARCH METHODS: Empirical Knowledge: based on observations. Answer questions why, whom, how, and when. Experimental research: treatments are given for the purpose of research. Experimental group
More informationThe validity of inferences about the correlation (covariation) between treatment and outcome.
Threats Summary (MGS 9940, Fall, 2004-Courtesy of Amit Mehta) Statistical Conclusion Validity The validity of inferences about the correlation (covariation) between treatment and outcome. Threats to Statistical
More informationCHAPTER 4 RESULTS. In this chapter the results of the empirical research are reported and discussed in the following order:
71 CHAPTER 4 RESULTS 4.1 INTRODUCTION In this chapter the results of the empirical research are reported and discussed in the following order: (1) Descriptive statistics of the sample; the extraneous variables;
More informationSTATISTICAL CONCLUSION VALIDITY
Validity 1 The attached checklist can help when one is evaluating the threats to validity of a study. VALIDITY CHECKLIST Recall that these types are only illustrative. There are many more. INTERNAL VALIDITY
More informationPLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity
PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity Measurement & Variables - Initial step is to conceptualize and clarify the concepts embedded in a hypothesis or research question with
More information2-Group Multivariate Research & Analyses
2-Group Multivariate Research & Analyses Research Designs Research hypotheses Outcome & Research Hypotheses Outcomes & Truth Significance Tests & Effect Sizes Multivariate designs Increased effects Increased
More informationPsychology Research Process
Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:
More informationDesign of Experiments & Introduction to Research
Design of Experiments & Introduction to Research 1 Design of Experiments Introduction to Research Definition and Purpose Scientific Method Research Project Paradigm Structure of a Research Project Types
More information9. Interpret a Confidence level: "To say that we are 95% confident is shorthand for..
Mrs. Daniel AP Stats Chapter 8 Guided Reading 8.1 Confidence Intervals: The Basics 1. A point estimator is a statistic that 2. The value of the point estimator statistic is called a and it is our "best
More informationAssignment 4: True or Quasi-Experiment
Assignment 4: True or Quasi-Experiment Objectives: After completing this assignment, you will be able to Evaluate when you must use an experiment to answer a research question Develop statistical hypotheses
More informationTHE INTERPRETATION OF EFFECT SIZE IN PUBLISHED ARTICLES. Rink Hoekstra University of Groningen, The Netherlands
THE INTERPRETATION OF EFFECT SIZE IN PUBLISHED ARTICLES Rink University of Groningen, The Netherlands R.@rug.nl Significance testing has been criticized, among others, for encouraging researchers to focus
More informationValidity and Quantitative Research. What is Validity? What is Validity Cont. RCS /16/04
Validity and Quantitative Research RCS 6740 6/16/04 What is Validity? Valid Definition (Dictionary.com): Well grounded; just: a valid objection. Producing the desired results; efficacious: valid methods.
More informationExperiments. Outline. Experiment. Independent samples designs. Experiment Independent variable
Experiments 1 Outline Experiment Independent variable Factor Levels Condition Dependent variable Control variable Internal vs external validity Independent samples designs Why / Why not? Carry-over effects
More informationPsychology Research Process
Psychology Research Process Logical Processes Induction Observation/Association/Using Correlation Trying to assess, through observation of a large group/sample, what is associated with what? Examples:
More informationSheila Barron Statistics Outreach Center 2/8/2011
Sheila Barron Statistics Outreach Center 2/8/2011 What is Power? When conducting a research study using a statistical hypothesis test, power is the probability of getting statistical significance when
More informationAppendix B Statistical Methods
Appendix B Statistical Methods Figure B. Graphing data. (a) The raw data are tallied into a frequency distribution. (b) The same data are portrayed in a bar graph called a histogram. (c) A frequency polygon
More informationChapter 12: Introduction to Analysis of Variance
Chapter 12: Introduction to Analysis of Variance of Variance Chapter 12 presents the general logic and basic formulas for the hypothesis testing procedure known as analysis of variance (ANOVA). The purpose
More informationCHAPTER ONE CORRELATION
CHAPTER ONE CORRELATION 1.0 Introduction The first chapter focuses on the nature of statistical data of correlation. The aim of the series of exercises is to ensure the students are able to use SPSS to
More informationExperimental Psychology Arlo Clark Foos
Inferential Statistics Experimental Psychology Arlo Clark Foos Descriptive vs. Inferential Stats Descriptive Inferential Inferential Statistics Did your IV have an effect? By chance alone Large vs. Small
More information1 The conceptual underpinnings of statistical power
1 The conceptual underpinnings of statistical power The importance of statistical power As currently practiced in the social and health sciences, inferential statistics rest solidly upon two pillars: statistical
More informationControlled Experiments
CHARM Choosing Human-Computer Interaction (HCI) Appropriate Research Methods Controlled Experiments Liz Atwater Department of Psychology Human Factors/Applied Cognition George Mason University lizatwater@hotmail.com
More informationChapter 9 Experimental Research (Reminder: Don t forget to utilize the concept maps and study questions as you study this and the other chapters.
Chapter 9 Experimental Research (Reminder: Don t forget to utilize the concept maps and study questions as you study this and the other chapters.) In this chapter we talk about what experiments are, we
More informationChapter 5: Research Language. Published Examples of Research Concepts
Chapter 5: Research Language Published Examples of Research Concepts Contents Constructs, Types of Variables, Types of Hypotheses Note Taking and Learning References Constructs, Types of Variables, Types
More informationResearch Designs. Inferential Statistics. Two Samples from Two Distinct Populations. Sampling Error (Figure 11.2) Sampling Error
Research Designs Inferential Statistics Note: Bring Green Folder of Readings Chapter Eleven What are Inferential Statistics? Refer to certain procedures that allow researchers to make inferences about
More informationThreats to validity in intervention studies. Potential problems Issues to consider in planning
Threats to validity in intervention studies Potential problems Issues to consider in planning An important distinction Credited to Campbell & Stanley (1963) Threats to Internal validity Threats to External
More informationStatistical Significance, Effect Size, and Practical Significance Eva Lawrence Guilford College October, 2017
Statistical Significance, Effect Size, and Practical Significance Eva Lawrence Guilford College October, 2017 Definitions Descriptive statistics: Statistical analyses used to describe characteristics of
More informationAnalysis of Variance (ANOVA)
Research Methods and Ethics in Psychology Week 4 Analysis of Variance (ANOVA) One Way Independent Groups ANOVA Brief revision of some important concepts To introduce the concept of familywise error rate.
More informationProfile Analysis. Intro and Assumptions Psy 524 Andrew Ainsworth
Profile Analysis Intro and Assumptions Psy 524 Andrew Ainsworth Profile Analysis Profile analysis is the repeated measures extension of MANOVA where a set of DVs are commensurate (on the same scale). Profile
More informationCHAPTER EIGHT EXPERIMENTAL RESEARCH: THE BASICS of BETWEEN GROUP DESIGNS
CHAPTER EIGHT EXPERIMENTAL RESEARCH: THE BASICS of BETWEEN GROUP DESIGNS Chapter Objectives: Understand that the purpose of experiments and group quasi-experiments is to investigate differences between
More informationStudent Performance Q&A:
Student Performance Q&A: 2009 AP Statistics Free-Response Questions The following comments on the 2009 free-response questions for AP Statistics were written by the Chief Reader, Christine Franklin of
More informationSix Sigma Glossary Lean 6 Society
Six Sigma Glossary Lean 6 Society ABSCISSA ACCEPTANCE REGION ALPHA RISK ALTERNATIVE HYPOTHESIS ASSIGNABLE CAUSE ASSIGNABLE VARIATIONS The horizontal axis of a graph The region of values for which the null
More informationUNIT 4 ALGEBRA II TEMPLATE CREATED BY REGION 1 ESA UNIT 4
UNIT 4 ALGEBRA II TEMPLATE CREATED BY REGION 1 ESA UNIT 4 Algebra II Unit 4 Overview: Inferences and Conclusions from Data In this unit, students see how the visual displays and summary statistics they
More informationChapter 2--Norms and Basic Statistics for Testing
Chapter 2--Norms and Basic Statistics for Testing Student: 1. Statistical procedures that summarize and describe a series of observations are called A. inferential statistics. B. descriptive statistics.
More informationChapter Three: Sampling Methods
Chapter Three: Sampling Methods The idea of this chapter is to make sure that you address sampling issues - even though you may be conducting an action research project and your sample is "defined" by
More informationDesigns. February 17, 2010 Pedro Wolf
Designs February 17, 2010 Pedro Wolf Today Sampling Correlational Design Experimental Designs Quasi-experimental Design Mixed Designs Multifactioral Design Sampling Overview Sample- A subset of a population
More informationChapter 23. Inference About Means. Copyright 2010 Pearson Education, Inc.
Chapter 23 Inference About Means Copyright 2010 Pearson Education, Inc. Getting Started Now that we know how to create confidence intervals and test hypotheses about proportions, it d be nice to be able
More informationSomething to think about. What happens, however, when we have a sample with less than 30 items?
One-Sample t-test Remember In the last chapter, we learned to use a statistic from a large sample of data to test a hypothesis about a population parameter. In our case, using a z-test, we tested a hypothesis
More informationPsy201 Module 3 Study and Assignment Guide. Using Excel to Calculate Descriptive and Inferential Statistics
Psy201 Module 3 Study and Assignment Guide Using Excel to Calculate Descriptive and Inferential Statistics What is Excel? Excel is a spreadsheet program that allows one to enter numerical values or data
More informationCHAPTER - 6 STATISTICAL ANALYSIS. This chapter discusses inferential statistics, which use sample data to
CHAPTER - 6 STATISTICAL ANALYSIS 6.1 Introduction This chapter discusses inferential statistics, which use sample data to make decisions or inferences about population. Populations are group of interest
More informationUse of the Quantitative-Methods Approach in Scientific Inquiry. Du Feng, Ph.D. Professor School of Nursing University of Nevada, Las Vegas
Use of the Quantitative-Methods Approach in Scientific Inquiry Du Feng, Ph.D. Professor School of Nursing University of Nevada, Las Vegas The Scientific Approach to Knowledge Two Criteria of the Scientific
More informationUnit 1 Exploring and Understanding Data
Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile
More informationPsychology 205, Revelle, Fall 2014 Research Methods in Psychology Mid-Term. Name:
Name: 1. (2 points) What is the primary advantage of using the median instead of the mean as a measure of central tendency? It is less affected by outliers. 2. (2 points) Why is counterbalancing important
More informationINADEQUACIES OF SIGNIFICANCE TESTS IN
INADEQUACIES OF SIGNIFICANCE TESTS IN EDUCATIONAL RESEARCH M. S. Lalithamma Masoomeh Khosravi Tests of statistical significance are a common tool of quantitative research. The goal of these tests is to
More information6. A theory that has been substantially verified is sometimes called a a. law. b. model.
Chapter 2 Multiple Choice Questions 1. A theory is a(n) a. a plausible or scientifically acceptable, well-substantiated explanation of some aspect of the natural world. b. a well-substantiated explanation
More informationStatistics for Psychology
Statistics for Psychology SIXTH EDITION CHAPTER 3 Some Key Ingredients for Inferential Statistics Some Key Ingredients for Inferential Statistics Psychologists conduct research to test a theoretical principle
More informationPsychology 2019 v1.3. IA2 high-level annotated sample response. Student experiment (20%) August Assessment objectives
Student experiment (20%) This sample has been compiled by the QCAA to assist and support teachers to match evidence in student responses to the characteristics described in the instrument-specific marking
More informationChoosing designs and subjects (Bordens & Abbott Chap. 4)
Choosing designs and subjects (Bordens & Abbott Chap. 4) Once we have examined all the nitty-gritty details of a study (e.g., variables, variable levels), it is time to conceptually organize the details
More informationLecture 4: Research Approaches
Lecture 4: Research Approaches Lecture Objectives Theories in research Research design approaches ú Experimental vs. non-experimental ú Cross-sectional and longitudinal ú Descriptive approaches How to
More informationWriting Reaction Papers Using the QuALMRI Framework
Writing Reaction Papers Using the QuALMRI Framework Modified from Organizing Scientific Thinking Using the QuALMRI Framework Written by Kevin Ochsner and modified by others. Based on a scheme devised by
More informationChapter 3. Psychometric Properties
Chapter 3 Psychometric Properties Reliability The reliability of an assessment tool like the DECA-C is defined as, the consistency of scores obtained by the same person when reexamined with the same test
More informationReview and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy
Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy Final Proposals Read instructions carefully! Check Canvas for our comments on
More informationStudy Design. Svetlana Yampolskaya, Ph.D. Summer 2013
Study Design Svetlana Yampolskaya, Ph.D. Summer 2013 Study Design Single point in time Cross-Sectional Research Multiple points in time Study only exists in this point in time Longitudinal Research Study
More informationAssignment 4: True or Quasi-Experiment
Assignment 4: True or Quasi-Experiment Objectives: After completing this assignment, you will be able to Evaluate when you must use an experiment to answer a research question Develop hypotheses that can
More informationFORM C Dr. Sanocki, PSY 3204 EXAM 1 NAME
PSYCH STATS OLD EXAMS, provided for self-learning. LEARN HOW TO ANSWER the QUESTIONS; memorization of answers won t help. All answers are in the textbook or lecture. Instructors can provide some clarification
More informationAbstract Title Page Not included in page count. Title: Analyzing Empirical Evaluations of Non-experimental Methods in Field Settings
Abstract Title Page Not included in page count. Title: Analyzing Empirical Evaluations of Non-experimental Methods in Field Settings Authors and Affiliations: Peter M. Steiner, University of Wisconsin-Madison
More informationObjectives. Quantifying the quality of hypothesis tests. Type I and II errors. Power of a test. Cautions about significance tests
Objectives Quantifying the quality of hypothesis tests Type I and II errors Power of a test Cautions about significance tests Designing Experiments based on power Evaluating a testing procedure The testing
More informationbaseline comparisons in RCTs
Stefan L. K. Gruijters Maastricht University Introduction Checks on baseline differences in randomized controlled trials (RCTs) are often done using nullhypothesis significance tests (NHSTs). In a quick
More informationPower and Effect Size Measures: A Census of Articles Published from in the Journal of Speech, Language, and Hearing Research
Power and Effect Size Measures: A Census of Articles Published from 2009-2012 in the Journal of Speech, Language, and Hearing Research Manish K. Rami, PhD Associate Professor Communication Sciences and
More information15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA
15.301/310, Managerial Psychology Prof. Dan Ariely Recitation 8: T test and ANOVA Statistics does all kinds of stuff to describe data Talk about baseball, other useful stuff We can calculate the probability.
More informationHPS301 Exam Notes- Contents
HPS301 Exam Notes- Contents Week 1 Research Design: What characterises different approaches 1 Experimental Design 1 Key Features 1 Criteria for establishing causality 2 Validity Internal Validity 2 Threats
More informationStatistics Guide. Prepared by: Amanda J. Rockinson- Szapkiw, Ed.D.
This guide contains a summary of the statistical terms and procedures. This guide can be used as a reference for course work and the dissertation process. However, it is recommended that you refer to statistical
More information