Validity Arguments for Alternate Assessment Systems


Scott Marion, Center for Assessment
Reidy Interactive Lecture Series, Portsmouth, NH, September 25-26, 2008
Marion. Validity Arguments for AA-AAS. RILS 2008

Acknowledgments
I appreciate the opportunity to learn from the states, evaluators, expert panelists, and my colleagues participating in the NAAC-GSEG and the NH-EAG projects. Much of what I am presenting today is drawn from collaborative work with Marianne Perie and from many conversations with Brian Gong.

Documenting Technical Quality
Our interest in validity arguments for alternate assessment grew out of the need to provide defensible documentation of the technical quality of these assessments. We argue that the purpose of the technical documentation is to provide data to support or refute the validity of the inferences from the alternate assessments at both the student and program levels.

Challenges of Alternate Assessments
- A small, very heterogeneous group of students, which creates difficulties for statistical analyses
- Flexibility in assessment targets, assessment events, and administration
All of these create psychometric challenges, especially for comparability.

Is it Psychometrics or Social Justice?
Initially, we focused on consequences (yes, they are part of validity!) because the intended consequences were a major rationale for including all students in standards-based education. However, we realized even before writing the NH-EAG proposal that we needed to more completely evaluate the technical quality of alternate assessments.

Many Possibilities
Elsewhere (Marion & Perie, 2008) we have presented examples of evaluation questions and potential studies within familiar (e.g., joint standards) and less familiar (e.g., Knowing What Students Know; Ryan, 2002) frameworks for structuring legitimate validity evaluations. Further, the work of the NH-EAG and NAAC has demonstrated that many familiar forms of analyses are possible, even if they require some different thinking.

Why a Validity Argument?
- Serves to organize studies
- Provides a framework for analysis and synthesis
- Uses a falsification orientation
- Forces critical evaluation of claims: it basically requires the user, developer, and/or evaluator to search for all the reasons why the intended inferences are NOT supported
- In practice we cannot search for ALL the reasons, so we need to prioritize studies

Kane's argument-based framework assumes that the proposed interpretations and uses will be explicitly stated as "an argument, or network of inferences and supporting assumptions, leading from observations to the conclusions and decisions. Validation involves an appraisal of the coherence of this argument and of the plausibility of its inferences and assumptions" (Kane, 2006, p. 17).

Two Types of Arguments
- An interpretative argument specifies the proposed interpretations and uses of test results by laying out the network of inferences and assumptions leading from the observed performances to the conclusions and decisions based on those performances.
- The validity argument provides an evaluation of the interpretative argument (Kane, 2006).

The Interpretative Argument
Essentially a mini-theory: the interpretative argument provides a framework for the interpretation and use of test scores. Like a theory, the interpretative argument guides the data collection and methods; most importantly, like theories, it is falsifiable as we critically evaluate the evidence and arguments.

More Simply
Test validation is basically the process of offering assertions (propositions) about a test or a testing program and then collecting data and posing logical arguments to refute those assertions. If the assertions cannot be refuted, we can say that they are tentatively supported (and that's the best we can do!).
A simple organizational scheme for the propositions:
- What does the testing practice claim to do?
- What are the arguments for and against the intended aims of the test?
- What does the test do in the system other than what it claims, for good or bad? (Shepard, 1993, p. 429)

Values and Consequences
Evaluating a decision procedure requires an evaluation of values and consequences: "To evaluate a testing program as an instrument of policy [e.g., AA-AAS under NCLB], it is necessary to evaluate its consequences" (Kane, 2006, p. 53). Therefore, the values inherent in the testing program must be made explicit, and the consequences of the decisions made as a result of test scores must be evaluated. Kane is yet one more authority in the long line of validity theorists (Cronbach, Messick, Shepard, Linn, Haertel, Moss) making it quite clear that consequences are an integral part of test validation.

Getting Started
Katherine Ryan (2002) and others have suggested that laying out a more general theory of action is a useful starting point for developing a more complete validity argument. Marianne Perie and I created the following EXAMPLE theory of action for an alternate assessment system.

Example Theory of Action
- Teachers have the resources and supports necessary to administer a standards-aligned AA-AAS
- Teachers provide instruction aligned with the grade-level content standards and academic expectations assessed
- Students get greater exposure to grade-level academic curriculum
- The AA-AAS has been designed to allow students to demonstrate their knowledge and skills in relation to academic expectations
- The content assessed by the AA-AAS is appropriately rigorous and aligned with grade-level content standards
- Students respond to tasks/prompts as intended
- AA-AAS scores appropriately reflect student knowledge and skills
- AA-AAS scores provide information that is useful for teachers in building and maintaining curriculum and instruction aligned with academic expectations
- Students achieve increasingly higher academic expectations
- Negative consequences are minimized

Propositions underlying a single claim
Claim: AA-AAS scores appropriately reflect student knowledge and skills
1. The rubric appropriately captures the knowledge and skills valued and assessed
2. Scorers apply the rubric accurately and consistently when scoring the AA-AAS
3. Scorers are trained effectively
4. Construct-irrelevant variance is minimized in the scoring

One of the most effective challenges to interpretative arguments (or scientific theories) is to propose and substantiate an alternative argument that is more plausible. With AA-AAS we have to seriously consider and challenge ourselves with competing alternative explanations for test scores. For example:
- Higher scores on our state's AA-AAS reflect greater learning of the content frameworks, OR
- Higher scores on our state's AA-AAS reflect higher levels of student functioning, OR
- Higher scores on our state's AA-AAS reflect greater understanding by teachers of how to gather evidence or administer the test

Evaluating the Validity Argument
Haertel (1999) noted that the individual pieces of evidence do not make the assessment system valid or not; it is only by synthesizing this evidence in order to evaluate the interpretative argument that we can judge the validity of the assessment program. But it is hard to find rules in educational measurement to help us with this synthesis and evaluation. However, much can be learned and borrowed from program and policy evaluation.

Synthesis & Evaluation
Synthesizing all of this information to arrive at a judgment about the testing program is an intellectual challenge, in part because we're working along continua of evidence and arguments. The interpretative argument is used to structure the evaluation. The propositions should be written in such a way that we can judge whether the evidence supports or does not support each particular claim.

Evaluating the Propositions
We have to critically evaluate the evidence and logic that support or refute the specific propositions. In the context of states' large-scale assessment systems, we do not have the luxury of concluding, "that's not working, let's start over." In cases where the findings refute the propositions, we need to look for ways to improve the system.

Dynamic Evaluation
In almost all cases when evaluating the validity of state assessment systems, the studies are completed over a long time span. We are rarely in the position of having all the evidence in front of us to make a conclusive judgment. Therefore, we must engage in an ongoing, dynamic evaluation as new evidence is produced.

Back to Argument
Basing the validity evaluation on a well-founded argument enables us to structure our dynamic evaluation so that we can build a comprehensive case for or against the assessment system. It is possible (and it has happened recently) that the evidence suggests starting over. Without the structure of an argument, we just have a bunch of studies and little guidance for how to weigh the different results.

For more information
Scott Marion
smarion@nciea.org
www.nciea.org
www.naacpartners.org