Experimental Design. Dewayne E Perry ENS C Empirical Studies in Software Engineering Lecture 8

Similar documents
Underlying Theory & Basic Issues

26:010:557 / 26:620:557 Social Science Research Methods

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

Chapter 11 Nonexperimental Quantitative Research Steps in Nonexperimental Research

(CORRELATIONAL DESIGN AND COMPARATIVE DESIGN)

Lecture 9 Internal Validity

11/24/2017. Do not imply a cause-and-effect relationship

CHAPTER ONE CORRELATION

The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016

STATISTICAL CONCLUSION VALIDITY

Experimental and Quasi-Experimental designs

Empirical Knowledge: based on observations. Answer questions why, whom, how, and when.

investigate. educate. inform.

RESEARCH METHODS. A Process of Inquiry. tm HarperCollinsPublishers ANTHONY M. GRAZIANO MICHAEL L RAULIN

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity

04/12/2014. Research Methods in Psychology. Chapter 6: Independent Groups Designs. What is your ideas? Testing

Previous Example. New. Tradition

HPS301 Exam Notes- Contents

Experimental Design and the struggle to control threats to validity

Variables Research involves trying to determine the relationship between two or more variables.

Chapter 14: More Powerful Statistical Methods

A. Indicate the best answer to each the following multiple-choice questions (20 points)

Design of Experiments & Introduction to Research

Research Methodology. Characteristics of Observations. Variables 10/18/2016. Week Most important know what is being observed.

Still important ideas

Business Statistics Probability

Psychology Research Process

VALIDITY OF QUANTITATIVE RESEARCH

Experimental Psychology

Psychology 205, Revelle, Fall 2014 Research Methods in Psychology Mid-Term. Name:

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Lecture 4: Research Approaches

The essential focus of an experiment is to show that variance can be produced in a DV by manipulation of an IV.

Experimental Research. Types of Group Comparison Research. Types of Group Comparison Research. Stephen E. Brock, Ph.D.

9 research designs likely for PSYC 2100

EXPERIMENTAL AND EX POST FACTO DESIGNS M O N A R A H I M I

Educational Research. S.Shafiee. Expert PDF Trial

Chapter 11. Experimental Design: One-Way Independent Samples Design

PSY 250. Designs 8/11/2015. Nonexperimental vs. Quasi- Experimental Strategies. Nonequivalent Between Group Designs

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

One slide on research question Literature review: structured; holes you will fill in Your research design

Psychology Research Process

Experimental Research I. Quiz/Review 7/6/2011

EXPERIMENTAL RESEARCH DESIGNS

2013/4/28. Experimental Research

Study Guide for the Final Exam

Regression Discontinuity Analysis

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Political Science 15, Winter 2014 Final Review

Still important ideas

Psych 1Chapter 2 Overview

The Regression-Discontinuity Design

Unit 1 Exploring and Understanding Data

Research Approaches Quantitative Approach. Research Methods vs Research Design

Chapter 9 Experimental Research (Reminder: Don t forget to utilize the concept maps and study questions as you study this and the other chapters.

STA630 Research Methods Solved MCQs By

SUPPLEMENTARY INFORMATION

Experimental Research in HCI. Alma Leora Culén University of Oslo, Department of Informatics, Design

Chapter 1: Explaining Behavior

Clever Hans the horse could do simple math and spell out the answers to simple questions. He wasn t always correct, but he was most of the time.

RESEARCH METHODS. Winfred, research methods, ; rv ; rv

VARIABLES AND MEASUREMENT

A Brief Guide to Writing

RESEARCH METHODS. Winfred, research methods,

Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F

Study Design. Svetlana Yampolskaya, Ph.D. Summer 2013

STATISTICS AND RESEARCH DESIGN

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

On the purpose of testing:

Chapter 1 Introduction to Educational Research

Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Readings: Textbook readings: OpenStax - Chapters 1 11 Online readings: Appendix D, E & F Plous Chapters 10, 11, 12 and 14

Collecting & Making Sense of

2 Critical thinking guidelines

Behaviorism: An essential survival tool for practitioners in autism

Chapter 1 Thinking Critically with Psychological Science

Psychology of Dysfunctional Behaviour RESEARCH METHODS

The following are questions that students had difficulty with on the first three exams.

Applied Statistical Analysis EDUC 6050 Week 4

CSC2130: Empirical Research Methods for Software Engineering

Doctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, The Scientific Method of Problem Solving

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making effective decisions

Designing Psychology Experiments: Data Analysis and Presentation

PTHP 7101 Research 1 Chapter Assignments

Experimental Design Part II

Validity and Quantitative Research. What is Validity? What is Validity Cont. RCS /16/04

Experiments. ESP178 Research Methods Dillon Fitch 1/26/16. Adapted from lecture by Professor Susan Handy

Measurement. Reliability vs. Validity. Reliability vs. Validity

UNIT II: RESEARCH METHODS

Assignment 4: True or Quasi-Experiment

Research Approach & Design. Awatif Alam MBBS, Msc (Toronto),ABCM Professor Community Medicine Vice Provost Girls Section

Review and Wrap-up! ESP 178 Applied Research Methods Calvin Thigpen 3/14/17 Adapted from presentation by Prof. Susan Handy

2. Collecting agromyticin hormonal as a measure of aggression may not be a measure of aggression: (1 point)

Human intuition is remarkably accurate and free from error.

In this chapter we discuss validity issues for quantitative research and for qualitative research.

SPRING GROVE AREA SCHOOL DISTRICT. Course Description. Instructional Strategies, Learning Practices, Activities, and Experiences.

SINGLE-CASE RESEARCH. Relevant History. Relevant History 1/9/2018

CHAPTER VI RESEARCH METHODOLOGY

Goal: To become familiar with the methods that researchers use to investigate aspects of causation and methods of treatment

Transcription:

Experimental Design Dewayne E Perry ENS 623 Perry@ece.utexas.edu 1

Problems in Experimental Design 2

True Experimental Design Goal: uncover causal mechanisms Primary characteristic: random assignment to sampling units If not random, then only Quasi Experimental Without randomization, cannot rule out some systematic biases Types of designs Between subject designs: sampling units are subjected to one treatment each Within subjects designs: sampling units receive two or more treatments 3

The No Effect Hypotheses Special place to the test of the hypothesis that the treatment is entirely without effect Reason: in a randomized experiment, this test may be performed virtually without assumptions of any kind ie, relying only on random assignment Contribution of randomization is clearest when expressed in terms of the test of no effect Does not mean that such tests are of greater practical importance It sets randomized and non-randomized aspects in sharper contrast Whereas inferences in non-randomized experiments require assumptions 4

The No Effect Hypotheses To say a treatment has no effect is to say that that each unit would exhibit the same value of the response whether assigned to treatment or to control A change in response indicates the treatment has some effect Will discuss later various tests of the significance and the size of effects 5

Alternative Hypotheses As opposed to the null hypothesis, experimental (alternative) hypotheses take a stand Different treatments behave differently (two tailed prediction) Predict what direction the expected differences take (one tailed) 6

Hypotheses and Theory Theory is a large scale map, different areas represent general principles Hypotheses are like small sectional maps, focus on specific areas Conceptual similarities range from very explicit to very vague fall back on hidden assumptions, regulative principles Give directions to our observations Some hypotheses spring from experimental observations Don t know where they will go Others from theory Conceptual hypotheses Rely on previous studies, theories 7

Creating Hypotheses Defining terms For observations to have value, abstractions have to be concretized Two types of definitions Operational: x is defined in terms of test y under conditions z Theoretical: abstract constructs used At some point must be operational for the experiment Hypotheses are predictive statements about the expected outcome They call for a test and embed a conclusion Explicit statements are de rigueur When comparisons are predicted they have to be explicated 8

Stating the Hypothesis Concomitant variation: X is a direct function of Y Comparative: other things being equal. H0 and H1 (null and alternative) are mutually exclusive and exhaustive Usually a specific H0 and a general H1 Try to reject H0 9

Hypothesis Testing Errors 2 types of errors: I and II Type I rejecting the null hypothesis when it is true Greater psychologically important risk Think there is a relationship when there is not Waste time in blind alleys Type II accepting the null hypothesis when it is false Deny a relationship when there is one In effect, reject useful results 10

Variables Independent and dependent Dependent: effect in which the researcher is interested Independent: cause of the effect Any event or condition can be conceptualized as either an independent of dependent variable Concerned about the effects of X on Y Ie, the causal effects of one on the other Both in the labs and in the field 11

Independent Variables No single or standard way of classifying variables Useful categorization (not mutually exclusive): Biological Eg, affects of gender in mentoring developers Environmental Eg, schedule pressure and fault insertion Hereditary Eg, IQ effects on complexity Previous training and experience Eg, effects of first programming languages Maturity Eg, age and elegance of program structures 12

Independent Variables Manipulated variables If an experiment, one expects manipulation Intentional and systematic variation Naturally occurring variables Manipulated by real life experience Eg, desk versus meeting inspections Context: normal of exceptional conditions Static group variables Pre-existing groups with identified characteristics: Organismic variables: sex, age, weight, etc Status variables: education, occupation, marital status Attribute variables: diagnoses, personality traits, behaviors Cannot be manipulated but are selected to gain proper contrast groups 13

Independent Variables Analogous to experimental treatment When used as a dependent variable, may make inferences as to how the group acquired its characteristics Eg, overweight lowers self-esteem Lower self esteem causes overweightness Risk of causal inferences - Tempting but risky Dependent variable not an accurate descriptor At best an association, connection, relationship, correlation Example of weight/esteem experiments Case 1: high calorie diet -> check esteem Ethical problems Case 2: overweight + low calorie -> raise esteem Doesn t prove overweight, low esteem Case 3: overweight + success -> lower weight High esteem, low weight doesn t prove le/ow Must be careful about the logic 14

Independent Variables Unidirectional paths Eg, height and self-esteem cannot switch Fixed by logic of antecedents and consequences Multiple variables: income, age -> truancy, discipline Need 2x2 analyses Question: does income discriminate truancy and discipline problems One-way, non-causal enabling relationships Eg, income IQ -> income But not vice versa Two-way, sequential causation Eg, success-failure and self-confidence Eg, baseball players slumps, hitting streaks -> Causation established by experimentation Manipulation Using static variables is descriptive/relational But not experimental 15

Independent Variables Establishing levels of independent variables First decision: categorical or continuous Eg, age is continuous, sex is categorical If continuous data, whole range, dichotomous, or graduated Risky as information is lost May be theoretical reasons for categories If hypothesis is state in categorical terms, then should be consistent If a relationship, not appropriate to break into dichotomies or nominal categories if variable is continuous For theoretical or rational, not statistical reasons Examine how the levels of categories established Should be consistent with hypothesis Possible groupings: extremes, ranges of categories, median split (as in IQ) 16

Independent Variables Continuous full-range distribution Sometimes linear correlations, sometimes not Eg, learning (perhaps), visual acuity (not) Theory driven levels Hypothesis stated consistently with current theory Strength of independent variable (magnitude of effect) Extreme groupings tend to magnify effect Increasing magnitude may reduce generality Can more easily argue weaker to stronger (eg, stress) and have great generality Levels of independent variable should match hypothesis 17

Dependent Variables Many possible flavors, literally thousands Eg, learning new design techniques Direction of observed change Amount of change The ease with which change effected Persistence of changes over time 2 general classes Diffusion fan out Eg, technology insertion and adoption Hierarchical variations changes in ranking Eg, changing roles in organizational structures 18

Practical Application Distinguish two kinds of control groups No treatment Ok for physical effects Problems where belief may confound Placebo Rule out belief effects Practical decision is not easy which to use Question of greatest interest Experience or knowledge of the general area Easy to make mistakes in a new area 19

Design 1 One shot case study: X O H- M- I(NR) S- Basic Designs Deficient in terms of any reasonable controls History may be alternative explanation Maturation not controlled for May be changes in instruments or judges Unknown state of participants Instrumentation not a factor: no pre-measurement Design 2 One group pretest: O X O Slight improvement, but no comparison 20

Design 3 Solomon Design True experimental, 4 group I R O X O II R X O III R O O IV R O H+ M+ I+ S+ All well controlled for Basic Designs 21

Basic Designs Solomon provides elegant illustration of logic of control Pretest performance scores in I & III to estimate pretest scores in II & IV Requires a leap of faith even if scores the same Even if differ greatly, II & IV could be equal to the mean of I & III Use estimated pre-test scores to enrich factorial analysis of variance of post test scores Tells us if any confounding of pre-test and treatment 22

Basic Designs Pre/post test effects: I+ II- III+ IV- Experimental treatment effects: I+ II+ III- IV- Pretest & X sensitization: I+ II- III- IV- Pretest sensitization = Extraneous effects: I+ II+ III+ IV+ YI YIII ( ) ( Y Y ) II IV 23

Basic Designs External Validity All 3 suffer from the possible confounding of selection and treatment Design 3 - Solomon: controls for confounding of treatment and pre-test sensitization Design 4: I R O X O III R O O IV: H+ M+ I+ S+ Deficient in pre-test sensitization eg, problem in attitude change or learning experiments Design 5 II R X O IV R O IV: H+ M+ I+ S+ Avoids pretest sensitization issues 24

Basic Designs Within subjects designs Each subject receives all treatments in turn Useful in SWE/CS repeated measures design Advantages: Same number of subjects used more effectively Each sampling unit serves as its own control Can examine relationships longitudinally Difficulties: Sensitization problems Learning etc Order of treatments may produce differences in successive measures Another threat to IV in longitudinal studies Regression towards mean When linear relationship is imperfect Eg, overweight people appear to lose weight, low IQs appear to become brighter Observed when variables consists of the same measure taken at two points in time and the correlation r < 1 25

Basic Designs Solve threat by standard Z score A raw score from which the sample mean has been subtracted and the difference then divided by the standard deviation Regression equation: Z = r Y XY Z X The estimated score of Y is predicted from the XY correlation r times the standard score of X If there is a perfect correlation, the Z scores will be equivalent; otherwise not if r < 1 26