Designing Experiments... Or how many times and ways can I screw that up?!?

www.geo.uzh.ch/microsite/icacogvis/ Designing Experiments... Or how many times and ways can I screw that up?!? Amy L. Griffin AutoCarto 2012, Columbus, OH

Outline When do I need to run an experiment and what kinds of experiments are there? How does the way I design my experiment affect what results I get? How do I analyze my data? What do I need to report when I write up my results? 2

Methods of Gathering Data on Cognition and Perception: introspection, naturalistic observation, interviews / surveys, many others mentioned by Corné and Kristien, and experiments. 3

When do I need to run an experiment? Hypothesis test or experiment? Advantage of experiments: control of variables. Disadvantage of experiments: real-world work is often variable (uncontrolled). (Diagram: independent variable and dependent variable.) Different user study goals (not always mutually exclusive): 1) improve the design of a geovisual tool; 2) understand the perceptual and cognitive processes involved in using tools. 4

Examples from real experiments Experiment I: Change detection experiment Experiment II: Hypothesis generation / tool use experiment Experiment III: Animated / static map visual cluster detection experiment 5

Experiments - Overview An independent variable (IV) is manipulated; there may be a single IV or multiple IVs (factorial design). One or more dependent variables (DVs) are measured. Many basic experiments consist of two levels of the independent variable: an experimental group (e.g., new technique) and a control group (e.g., standard technique). Extraneous variables (sources of error) are controlled by holding them constant (e.g., lighting, handedness of participants, etc.) or by randomizing their effects. This allows a causal relationship between the independent and dependent variables to be established. 6

Choosing an experimental task 1. You want one that reflects the demands of a real-world task. Otherwise there may be problems of generalizability. 2. But real-world tasks are often complicated, so it can be hard to control extraneous variables, especially for complex tasks. 3. Think through the demands of the task to be sure that your experiment lets you say something about the use of your tool/map design/etc. (task analysis). 7

(Figure from Hegarty et al. 2010.) 8

Operationalizing Variables (Experiment I) Sometimes variables can be operationalized in different ways: the DV accuracy could be % correct, or it could be % false alarms. How you do this may lead to different conclusions from your experiment, so careful thought here is important. Image source: Kirk Goldsberry 9

Control Variables and Confounding Variables Control variables: 1. It's impossible to control all extraneous variables. 2. We don't want to anyway, because over-controlling affects external validity** (generalizability). Confounding variables: 1. Confounders = variables that vary systematically with an independent variable. 2. We really want to avoid these, because they affect internal validity**. 3. Example: labels as confounders. ** More about these in Amy Lobben's module. Image source: Wikipedia Woolfolk, M., Castellan, W., and Brooks, C.I. (1983). Pepsi versus Coke: Labels, not tastes, prevail. Psychological Reports, 52(1): 185-186. 10

(Diagram: IV, confounding variable, DV.) Choropleth maps of absolute counts can confound area with the mapped phenomenon. Image source: Wikimedia 11

Potential problems: reliability and validity (much more in Module 3) Reliability (consistency): if the true value = 2, do you measure a value of 2 each time you measure with the method? (test-retest). Reliability is a necessary but not sufficient condition for validity. Validity (accuracy): Construct validity: did I adequately measure the variable I am looking at? (If the true value = 2, did my measure give me 2, or something else entirely?) Internal validity: can I infer that x caused y? (e.g., does a faster pace cause fewer correct answers?) External validity: is what you are measuring generalizable to groups beyond the people you are measuring? 12
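
Test-retest reliability is often quantified as the correlation between two measurement occasions. A minimal sketch in Python, assuming two hypothetical arrays of scores from the same participants measured twice:

    import numpy as np
    from scipy import stats

    # Hypothetical scores from the same ten participants, measured on two occasions
    time_1 = np.array([2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1, 2.0, 1.9])
    time_2 = np.array([2.0, 1.8, 2.5, 2.1, 2.2, 1.9, 2.2, 2.0, 2.1, 1.8])

    r, p = stats.pearsonr(time_1, time_2)  # a high r suggests consistent (reliable) measurement
    print(f"test-retest r = {r:.2f}")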

Factorial Design A factor = an independent variable. A level = a subdivision of a factor. Design names refer to both the number of factors and the number of levels. Experiment I crossed map type (choropleth vs. proportional circle) with cued change (yes vs. no), so it was a 2 x 2 design. 13
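
The cells of a factorial design are simply the cross-product of the factor levels. A minimal sketch of the 2 x 2 design above, with level names paraphrased from the slide:

    from itertools import product

    map_type = ["choropleth", "proportional circle"]  # factor 1, two levels
    cued_change = ["cued", "uncued"]                   # factor 2, two levels

    # Four cells in total: a 2 x 2 factorial design
    for cell in product(map_type, cued_change):
        print(cell)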

Basic Experimental Design Types Between subjects: two treatments (levels) of an independent variable, e.g., two different types of maps (static vs. animated); each participant is randomly assigned to one condition or the other; compares separate groups of individuals. Within subjects: two treatments (levels) of an independent variable; each participant receives both treatments, in sequence; compares treatments within one group of individuals. 14
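
For a between-subjects design, the random assignment step can be as simple as shuffling the participant list and splitting it. A minimal sketch (the participant IDs and condition names are hypothetical):

    import random

    participants = list(range(1, 21))   # 20 hypothetical participant IDs
    random.shuffle(participants)        # randomize the order

    static_group = participants[:10]    # first half -> static-map condition
    animated_group = participants[10:]  # second half -> animated-map condition
    print(static_group, animated_group)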

Between Subjects Advantages: Participants are not affected by practice on the task. Not affected by fatigue or boredom. Disadvantages: There can be differences in other (uncontrolled) variables between the groups (individual differences). You need a larger group of participants / more time. There can be differences in environmental variables if groups are tested in different places. 15

Within Subjects Advantages: Controls for individual differences between participants. Reduced environmental effects. Disadvantages: Carryover effects (practice / learning): participants may improve simply through practice at the task. Longer experimental sessions: participants may become tired or bored and their performance may deteriorate. 16

Experiment II Characteristics Naturalistic observation (quasi-experimental). Participants given real tools. Participants given a realistic task: understanding the dynamics of hantavirus in a mouse population and its potential effects on human health; generate hypotheses to explain the patterns seen in the data. 17

Research questions: 1. What do participants do with the system and how do they accomplish their goals? 2. What kinds of information do users attend to in the visual information display? 3. How is the information from the system used (cognitively)? 4. (If the segment is a hypothesis) What kinds of hypotheses are generated and do they differ throughout the process of model exploration? 19

The problem Users needed to be introduced to the tools. But if they all saw the map first and then used the map more than the other tools, what does that mean? That the map is more useful than the other tools? Or that the map is the one they remembered best? 20

The solution: counterbalancing Each expertise group sees each order of the tools. 18 participants, 3 groups, 3 tools = 3 blocks of 6.

Subject   Position 1   Position 2   Position 3
1 (E)     Map          Scatter      Time
2 (E)     Map          Time         Scatter
3 (E)     Scatter      Map          Time
4 (E)     Scatter      Time         Map
5 (E)     Time         Map          Scatter
6 (E)     Time         Scatter      Map
7 (G)     Map          Scatter      Time
(the same six orders repeat for the remaining participants in the other two groups) 21
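
The 3! = 6 tool orders in the table are simply the permutations of the three tools, with each group of six participants receiving each order once. A minimal sketch (the group labels are hypothetical; the tool names come from the table above):

    from itertools import permutations

    tools = ["Map", "Scatter", "Time"]
    orders = list(permutations(tools))  # 3! = 6 possible presentation orders

    # 3 groups x 6 orders = 18 participants; each group sees every order once
    subject = 1
    for group in ("Group 1", "Group 2", "Group 3"):  # hypothetical group names
        for order in orders:
            print(subject, group, " -> ".join(order))
            subject += 1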

A more complex problem Experiment III (factorial design): a 2 x 4 x 3 experiment. Compares the effectiveness of two conditions (animated / static) along with a number of other independent variables (IVs): gender (2 levels, between-subjects), pace (4 levels, within-subjects), coherence (3 levels, within-subjects). Potential for learning effects on the dependent variables (DVs), % correct and time, if not balanced properly!! 22

(Example stimuli: no cluster, subtle cluster, strong cluster.) 23

Complete counterbalancing Study II needed 3 (groups) x 3! (possible orders of the three conditions) = 3 x 6 = 18 subjects for full counterbalancing. For this study: 2 (conditions) x 2! (gender) x 4! (paces) x 3! (coherences) = 2 x 2 x 24 x 6 = 576 participants for full counterbalancing!!! 24

Clearly another solution is needed: the Balanced Latin Squares method. Each stimulus level is preceded and followed equally often by each other stimulus level. With six conditions: 2 sets of 12 stimuli (4 paces x 3 coherences). 25
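
A balanced Latin square (a Williams design) can be generated in code rather than by hand, which helps avoid the mis-programmed matrix mentioned on the next slide. A minimal sketch for an even number of conditions (for an odd number, the reversed square is also needed):

    def balanced_latin_square(n):
        # Williams design: each condition immediately precedes and follows
        # every other condition equally often across the n presentation orders.
        first_row = [0]
        low, high = 1, n - 1
        while len(first_row) < n:
            first_row.append(low)
            low += 1
            if len(first_row) < n:
                first_row.append(high)
                high -= 1
        # Each subsequent order shifts every condition index by one (mod n)
        return [[(c + i) % n for c in first_row] for i in range(n)]

    for order in balanced_latin_square(4):  # e.g., four pace levels
        print(order)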

Ways we screwed it up 1. Did not program the Latin Squares matrix properly. 2. Did not balance the two groups of stimuli and conditions properly. 3. Did not balance gender properly (this only required rerunning a few people). 26

Message for complicated designs Check and recheck and recheck before starting. Don't trust that everyone in the group did their part perfectly; have members of the group check each other's work. Create tables that help you check off who comes when (e.g., by gender), etc. 27

Other potential problems: not enough power (how many participants do I need?) The power of your experiment can be estimated with programs such as G*Power. What does power depend upon? The alpha level. The sample size. The effect size: the association between DV and IV, i.e., the separation of the means relative to error. 28
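
G*Power-style calculations can also be scripted. A minimal sketch using statsmodels; the medium effect size (Cohen's d = 0.5) and the 80% power target are illustrative assumptions, not values from the talk:

    from statsmodels.stats.power import TTestIndPower

    # Per-group sample size needed to detect a medium effect (d = 0.5)
    # with alpha = .05 and 80% power in a two-sided, independent-samples t-test
    n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                              power=0.8, alternative='two-sided')
    print(round(n_per_group))  # roughly 64 participants per group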

Decision vs. reality:

                              Reality: H0 is true      Reality: H0 is false
Decision: Reject H0           Wrong (Type I error)     Correct
Decision: Fail to reject H0   Correct                  Wrong (Type II error)

Image source: Bibby and Ferguson (2000) 29

Statistical Power Incorrectly failing to reject the null hypothesis is a Type II error: there really is an effect, but we did not find it. Statistical power: how well can I detect a real effect? Power can be described with the equation power = 1 - β, where β is the probability of making a Type II error; that is, power is the probability of not making a Type II error. 30

Power and alpha Relax alpha to increase power (e.g., α = 0.05 instead of 0.01). This increases the chance of a Type I error, so it is not a great option. Image source: Bibby and Ferguson (2000) 31

Power and sample size Low N = little power. High N does not always produce an increase in power (there is a saturation point), and high N = high cost. Goal: minimize N while maximizing power. Image source: Bibby and Ferguson (2000) 32

Power and effect size As the difference between the distributions increases, power also increases. Image source: Bibby and Ferguson (2000) 33

Power and effect size As the variance about a mean decreases, power also increases. Image source: Bibby and Ferguson (2000) 34

Practicalities and Experiments Does my experiment have ethical concerns? Pilot testing. Ethics panels & institutional review boards: your panel/board may require training before submission; procedures vary by country and by institution, so check your local regulations. Typically our work can get expedited approval because it's low risk, but budget sufficient time for this. How long will my experiment take? Should I set a priori criteria for eliminating participants? Is all of my equipment working as anticipated? 35

How do I analyze my data? Between-subjects: two-way ANOVA. Within-subjects: two-way repeated-measures ANOVA. Both within- and between-subjects: mixed or split-plot ANOVA. Main effect = the effect of one variable, holding the others constant. Interaction effect = the joint effect of multiple variables together. 36
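
A minimal sketch of a between-subjects two-way ANOVA in Python with statsmodels; the data frame and the column names (accuracy, map_type, cueing) are hypothetical stand-ins for your own DV and IVs:

    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    # Hypothetical per-participant accuracy scores for a 2 x 2 between-subjects design
    df = pd.DataFrame({
        "map_type": ["choropleth"] * 4 + ["prop_circle"] * 4,
        "cueing":   ["cued", "cued", "uncued", "uncued"] * 2,
        "accuracy": [0.90, 0.85, 0.70, 0.65, 0.80, 0.78, 0.72, 0.69],
    })

    # Main effects of map_type and cueing, plus their interaction, on accuracy
    model = smf.ols("accuracy ~ C(map_type) * C(cueing)", data=df).fit()
    print(anova_lm(model, typ=2))

For a fully within-subjects design, statsmodels also provides AnovaRM (repeated-measures ANOVA).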

What do I need to report when I write up my results? In addition to the normal sections of a paper: Participants: how many, and characteristics such as age, gender, experience, vision impairment, etc. Experimental design: how many factors are being studied, what the IVs and DVs are, and whether the design is within-subjects, between-subjects, or both. It also includes how you measured your variables (e.g., the questions asked, etc.). This paper is worth a look: Olson, J.M. (2009). Issues in human subject testing in cartography and GIS. Proceedings of the 2009 International Cartographic Conference, http://icaci.org/files/documents/icc_proceedings/icc2009/html/nonref/17_11.pdf. 37

What do I need to report when I write up my results? Materials/Apparatus: Materials should describe what stimuli (e.g., maps, tools, etc.) were used (and usually present examples), how the stimuli were created, the design decisions made during their creation, and the rationale for those decisions. Apparatus: if equipment such as an eye-tracker, colorimeter, or iPhone pointing device is used to capture data, technical details describing this equipment should be provided. Procedure: this description should contain enough detail about how stimuli were deployed/displayed, with what timings, in what order tasks were completed, etc., that someone else could replicate the procedure. For a good example to follow, see the following paper: Hegarty, M., Canham, M.S., and Fabrikant, S.I. (2010). Thinking about the weather: How display salience and knowledge affect performance in a graphic inference task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36(1): 37-53. 38

What do I need to report when I write up my results? Reporting statistical results: Conventions: statistic (degrees of freedom) = statistic value, p < criterion. E.g., t(18) = 4.7, p < .01; F(1, 58) = 26.73, p < .01. 39
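
If the statistics are computed in code, the conventional report string can be produced directly. A minimal sketch with SciPy; the two groups of scores are hypothetical:

    import numpy as np
    from scipy import stats

    # Hypothetical accuracy scores for two groups of participants
    group_a = np.array([0.82, 0.75, 0.90, 0.68, 0.77, 0.85, 0.71, 0.80, 0.74, 0.88])
    group_b = np.array([0.65, 0.70, 0.61, 0.73, 0.58, 0.66, 0.72, 0.60, 0.69, 0.63])

    t, p = stats.ttest_ind(group_a, group_b)  # pooled-variance (equal-variance) t-test
    df = len(group_a) + len(group_b) - 2      # degrees of freedom for that test
    print(f"t({df}) = {t:.2f}, p = {p:.4f}")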

Archive of Experimental Stimuli: A proposal Within the next couple of years, the ICA Commission on Cognitive Visualization proposes to establish an archive of experimental stimuli from research that has already been published, in order to do the following: 1. Help readers of papers more fully understand the design of experiments. 2. Provide researchers with examples of experimental materials that may help them in designing their own materials. 3. Stimulate collaborations and re-use of experimental materials, where appropriate. We are still thinking through the logistics of this, but watch for updates on our website: https://www.geo.uzh.ch/microsite/icacogvis/mission.html 40

Summary Experimental design can be quite complicated. The more factors you are trying to test, the more ways you can mess things up. Planning, testing, and rechecking are very important. There are tradeoffs in every experimental design; the key is finding the right ones for your situation and then executing that design properly. If you're not sure about decisions you need to make, consult the following sources: experimental psychologists or statisticians at your institution; previously published work or textbooks on experimental design. 41

Resources for Further Information Field, A., and Hole, G. (2003). How to design and report experiments. Thousand Oaks: SAGE. Field, A. (2009). Discovering statistics using SPSS, 3rd edition. Thousand Oaks: SAGE. Martin, D.W. (2008). Doing psychology experiments, 7th edition. Belmont, CA: Thomson Wadsworth. The ICA CogVis website has links to several useful experimental research tools: https://www.geo.uzh.ch/microsite/icacogvis/resources.html 42