Interface Validity: Investigating the potential role of face validity in content validation
Gábor Szabó, Robert Märcz
ECL Examinations


Outline
- Questions of face validity
- New approach
- Context, participants and instruments
- Results
- Conclusions

Educational context: Post mortem?
- It is important to seem to be testing as well as to be actually doing it
- Test takers' acceptance of the test:
  - contributes to its validity
  - is a source of motivation
- Lay opinion taken seriously?

New approach: Interface validity
Test takers are asked to
- give their opinion on the test (face validity)
- give their opinion on the content (content validity)

Context and participants
- ECL International Language Examination System
- Level B2
- Reading comprehension test
- Two tasks: sentence completion, short answer
- Online questionnaire
- 903 answers within the first week (ca. 50%)

The instrument
- Questionnaire of 17 items
- Four-point Likert scale (4: completely true, 1: not true at all)
- 6 items on face validity: general statements concerning difficulty, layout, etc.
- 11 items on content validity: paraphrased CEFR descriptors
- Two negative items (halo effect)

The Questionnaire - Examples
Face validity:
- 3. I had enough time to complete the tasks.
Content validity:
- Original CEFR descriptor: Can understand articles and reports concerned with contemporary problems in which the writers adopt particular stances or viewpoints.
- 9. I could understand the viewpoints of the writer.
- 16. It was difficult to understand the viewpoints of the writer.

Procedure
- Halo effect: analysing the parallel opposite items, we found significant negative correlations (-0.630 / -0.670)
- Responses with inconsistent response patterns were deleted
- 791 candidates' responses were found valid and consistent
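The consistency check on parallel opposite items can be illustrated with a minimal Python sketch. It assumes each response is stored as a mapping from item number to a 1-4 rating. Items 9 and 16 are the parallel pair from the examples slide; the toy data are invented, and the tolerance rule in `is_consistent` is a hypothetical criterion, since the slides do not state how inconsistent patterns were identified.

```python
from math import sqrt

SCALE_MIN, SCALE_MAX = 1, 4  # four-point Likert scale from the questionnaire


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def reverse_score(value):
    """Map a rating on a negatively worded item onto the positive scale (4 -> 1, 1 -> 4)."""
    return SCALE_MIN + SCALE_MAX - value


def is_consistent(response, positive_item, negative_item, tolerance=1):
    """Hypothetical rule (an assumption, not the authors' criterion):
    the reverse-scored negative item may differ from its positive
    counterpart by at most `tolerance` scale points."""
    gap = abs(response[positive_item] - reverse_score(response[negative_item]))
    return gap <= tolerance


# Invented toy data: item 9 (positive) and item 16 (its negative counterpart).
responses = [
    {9: 4, 16: 1},
    {9: 3, 16: 2},
    {9: 2, 16: 3},
    {9: 4, 16: 4},  # agrees fully with both opposite statements
    {9: 1, 16: 4},
]

# Correlation between the parallel opposite items (expected to be negative).
r = pearson([resp[9] for resp in responses], [resp[16] for resp in responses])

# Keep only respondents whose answers to the pair do not contradict each other.
kept = [resp for resp in responses if is_consistent(resp, 9, 16)]
```

With a rule of this kind, a respondent who rates both a statement and its opposite as completely true is dropped, while the remaining responses are retained for analysis.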

Results and analysis Descriptive statistics

Results and analysis: Item correlations
Expectation:
- Significant, probably moderate correlations
- Descriptors tap into different aspects of the B2 construct
Actual results:
- Strong, significant correlation (0.807) in one case:
  - Though the text was long I was able to scan it quickly
  - Though the text was complex I was able to scan it quickly

Results and analysis: Actual results
Moderate, significant correlations (0.405-0.654):
- I could quickly identify the content of the text / I could understand the viewpoints of the writer
- I could understand the stance of the writer / I could quickly identify the content of the text
- I could quickly identify the content of the text / Though the text was complex I was able to scan it quickly
Most consistent pattern of correlations in the case of item 8: I could quickly identify the content of the text

Results and analysis: Actual results
Low, sometimes not significant, occasionally negative correlations (<0.4):
- I could rarely find idioms in the text
- A broad active vocabulary was needed to complete the tasks
- The text was concerned with contemporary problems

Results and analysis: Batch correlations
- Correlating face validity items with content validity items
- Significant, moderate correlation (0.536) found
- Indication of relationship between constructs?
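A batch correlation of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the slides do not say how the batches were aggregated, so the sketch summarises each batch per respondent as the mean of its item ratings. The item groupings `FACE_ITEMS` and `CONTENT_ITEMS` and the toy data are hypothetical, and negatively worded items are assumed to have been reverse-scored beforehand.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def batch_mean(response, items):
    """Mean rating of one respondent over a batch of items (assumed aggregation)."""
    return sum(response[i] for i in items) / len(items)


# Hypothetical groupings; the real questionnaire had 6 face validity
# and 11 content validity items, whose numbering is not fully given.
FACE_ITEMS = [1, 2, 3]
CONTENT_ITEMS = [8, 9, 10]

# Invented toy data: one dict of item ratings (1-4) per respondent.
responses = [
    {1: 4, 2: 4, 3: 3, 8: 4, 9: 3, 10: 4},
    {1: 3, 2: 3, 3: 3, 8: 3, 9: 3, 10: 2},
    {1: 2, 2: 3, 3: 1, 8: 2, 9: 3, 10: 2},
    {1: 4, 2: 3, 3: 4, 8: 3, 9: 2, 10: 3},
    {1: 1, 2: 2, 3: 2, 8: 2, 9: 1, 10: 2},
]

face_scores = [batch_mean(resp, FACE_ITEMS) for resp in responses]
content_scores = [batch_mean(resp, CONTENT_ITEMS) for resp in responses]

# One coefficient describing how the two batches move together.
batch_r = pearson(face_scores, content_scores)
```

Aggregating by mean rather than sum keeps the two batch scores on the same 1-4 scale even though the batches contain different numbers of items; the correlation coefficient itself is unaffected by that choice.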

Conclusions
- Using candidate feedback in content validation is potentially useful
- Further analyses of data in progress
- Checking for significant differences between sets of responses to different items
- Refinement of reworded descriptors needed
- Further research necessary
- Relationship between candidate performance and opinion

Thank you for your attention!