Testing the measurement equivalence of electronically migrated patient-reported outcome (PRO) instruments: Challenges and Solutions

Similar documents
Migrating a PRO Instrument. Presented by: Serge Bodart (Biomedical Systems) and Alisandra Johnson (Bracket)

ISPOR European Congress Barcelona, Spain. November 14, 2018

Addressing Content Validity of PRO Measures: The Unique Case of Rare Diseases

Study Endpoint Considerations: Final PRO Guidance and Beyond

Recommendations on Evidence Needed to Support Measurement Equivalence between Electronic and

Intro to epro Part I

Donald L. Patrick PhD, MSPH, Laurie B. Burke RPh, MPH, Chad J. Gwaltney PhD, Nancy Kline Leidy PhD, Mona L. Martin RN, MPA, Lena Ring PhD

Patient Reported Outcomes (PROs) Tools for Measurement of Health Related Quality of Life

Task Force Background Donald Patrick. Appreciation to Reviewers. ISPOR PRO GRP Task Force Reports

Patient-Reported Outcomes (PROs) and the Food and Drug Administration Draft Guidance. Donald L. Patrick University of Washington

Advancing the Science of Clinical Trial Data Collection

Research Methods. Observational Methods. Correlation - Single Score. Basic Methods. Elaine Blakemore. Title goes here 1. Observational.

Achieving Operational Excellence in Prospective Observational Research

Reliability and Validity checks S-005

Measures. David Black, Ph.D. Pediatric and Developmental. Introduction to the Principles and Practice of Clinical Research

Considerations for requiring subjects to provide a response to electronic patient-reported outcome instruments

Estimating sample size for qualitative research in clinical outcome assessment research: one size does not fit all! IP23

NHC Webinar Series on Clinical Outcome Assessments. Patient-Reported Outcomes and Patient-Centered Outcomes: Is There a Difference?

Overview. Methods of assessment. Assessment of LUTS. Methods of self-report assessment. Why use self-report instruments 11/12/2015

Electronic Capture of PROs in NCI Clinical Trials

Overview. Methods of assessment. Assessment of LUTS. Patient Assessment and Bladder Diary 15/05/2017. How are LUTS and quality of life (QoL) assessed?

Step 3-- Operationally define the variables B. Operational definitions. 3. Evaluating operational definitions-validity

Questionnaire Translation

Observer OPTION 5 Manual

Resources 1/21/2014. Module 3: Validity and Reliability Ensuring the Rigor of Your Assessment Study January, Schuh, chapter 5: Instrumentation

02a: Test-Retest and Parallel Forms Reliability

Health and Quality of Life Outcomes BioMed Central

Model 1: Subject As Single Factor

Patient-Reported Outcomes to Support Medical Product Labeling Claims: FDA Perspective

RELIABILITY OF REAL TIME JUDGING SYSTEM

FDA s GREAT Workshop. Industry Perspective: Development Activities Towards Phase 3 Endpoints. September 19, 2012

Welcome and PRO Consortium Update

Lessons learned along the path to qualification of an IBS outcome measure*

Developing and Testing Survey Items

Developing a Pediatric COA Measurement Strategy A Case Study in Asthma

Exposure Review and Troubleshooting. David Valentiner & Simon Jencius Northern Illinois University ADAA 2014 Chicago, IL

Measurement is the process of observing and recording the observations. Two important issues:

Patient-Centered Endpoints in Oncology

Measuring impact. William Parienté UC Louvain J PAL Europe. povertyactionlab.org

Journal of Patient- Reported Outcomes. Colleen A. McHorney 1*, Mark E. Bensink 2, Laurie B. Burke 3, Vasily Belozeroff 2 and Chad Gwaltney 4,5

Closed Coding. Analyzing Qualitative Data VIS17. Melanie Tory

What Is Sufficient Evidence for the Reliability and Validity of Patient-Reported Outcome Measures?

Survey Research Centre. An Introduction to Survey Research

Case Study: Irritable Bowel Syndrome Working Group A Journey Through Time

CHAPTER - III METHODOLOGY

Methodological Considerations in Using Patient Reported Measures in Dialysis Clinics. Acknowledgements

Update on CDER s Drug Development Tool Qualification Program

Assessing Consumer Responses to RRP: Experience at PMI in Developing Fit-for-Purpose Self-Report Instruments

Child Outcomes Research Consortium. Recommendations for using outcome measures

PLS 506 Mark T. Imperial, Ph.D. Lecture Notes: Reliability & Validity

95% 2.5% 2.5% +2SD 95% of data will 95% be within of data will 1.96 be within standard deviations 1.96 of sample mean

Sample Size Determination for Number of Patient Interviews When Developing a PRO Using Qualitative Research Based on the 2009 FDA PRO Guidance

Disclaimer. ISPOR 22 nd Annual International Meeting May 24, 2017 Boston, Massachusetts, USA

Reliability AND Validity. Fact checking your instrument

Background to the Task Force. Introduction

Pediatrics Milestones and Meaningful Assessment Translating the Pediatrics Milestones into Assessment Items for use in the Clinical Setting

Chapter 4B: Reliability Trial 2. Chapter 4B. Reliability. General issues. Inter-rater. Intra - rater. Test - Re-test

Validity and reliability of measurements

SUPPLEMENTARY DATA American Diabetes Association. Published online at

3. They complete the gaps of transcript with the correct verbs.

Downloaded from:

Patient engagement: The future for outcomes research and clinical trials?

Dementia Quality of Life (DEMQOL)

Interrater and Intrarater Reliability of the Assisting Hand Assessment

Introduction On Assessing Agreement With Continuous Measurement

Comparison of the Null Distributions of

Be Quiet? Evaluating Proactive and Reactive User Interface Assistants

Using Direct Behavior Ratings in a Middle School Setting

Thinking with the End in Mind: From COA Instrument to Endpoint

N Utilization of Nursing Research in Advanced Practice, Summer 2008

QUALITY OF LIFE ASSESSMENT KIT

Overview of Clinical Study Design Laura Lee Johnson, Ph.D. Statistician National Center for Complementary and Alternative Medicine US National

Methodological Issues in Measuring the Development of Character

how good is the Instrument? Dr Dean McKenzie

Psychological testing

Comparing Vertical and Horizontal Scoring of Open-Ended Questionnaires

About Reading Scientific Studies

Validity and reliability of measurements

Cardiac Rehabilitation & Exercise Training in Congenital Heart Disease. Jidong Sung Division of Cardiology Sungkyunkwan University School of Medicine

Health authorities are asking for PRO assessment in dossiers From rejection to recognition of PRO

Advancing Use of Patient Preference Information as Scientific Evidence in Medical Product Evaluation Day 2

CHAPTER 2 CRITERION VALIDITY OF AN ATTENTION- DEFICIT/HYPERACTIVITY DISORDER (ADHD) SCREENING LIST FOR SCREENING ADHD IN OLDER ADULTS AGED YEARS

Point-of-service questionnaires can reliably assess patients experiences

Improving. Bracket. with. Alzheimer s Disease Clinical Research

Question development history for 2018 Census: Gender Identity

TITLE: Development of Pain Endpoint Models for Use in Prostate Cancer Clinical Trials and Drug Approval

MEASURING PATIENT AND OBSERVER-REPORTED OUTCOMES (PROS AND OBSROS) IN RARE DISEASE CLINICAL TRIALS - EMERGING GOOD PRACTICES TASK FORCE

Development of a self-reported Chronic Respiratory Questionnaire (CRQ-SR)

ADMS Sampling Technique and Survey Studies

Variables in Research. What We Will Cover in This Section. What Does Variable Mean? Any object or event that can take on more than one form or value.

Road Home Program: Center for Veterans and Their Families at Rush. Philip Held, Ph.D. Research Director

CONTENT ANALYSIS OF COGNITIVE BIAS: DEVELOPMENT OF A STANDARDIZED MEASURE Heather M. Hartman-Hall David A. F. Haaga

BOARD ON HUMAN-SYSTEMS INTEGRATION. The Role of Measurement in System and Human Performance Evaluation

Study Designs in Epidemiology

RARE DISEASE WORKSHOP SERIES Improving the Clinical Development Process. Disclaimer:

How to incorporate Patient- Reported Outcomes (PROs) in Cochrane reviews? Caroline Terwee, Donald Patrick, Gordon Guyatt, Riekie de Vet

Variables in Research. What We Will Cover in This Section. What Does Variable Mean?

Assessing Cultural Competency from the Patient s Perspective: The CAHPS Cultural Competency (CC) Item Set

Transcription:

Testing the measurement equivalence of electronically migrated patient-reported outcome (PRO) instruments: Challenges and Solutions Sonya Eremenco, Evidera, Inc., Bethesda, MD, USA Diane Wild, ICON, Oxford, UK Karl McEvoy, CRF Health, Hammersmith, UK Jason Lundy, Critical Path Institute, Tucson, AZ, USA

Overview Introduction Qualitative Assessment of Conceptual Equivalence Challenges Study design considerations Study design options Quantitative Assessment of Statistical Equivalence Challenges Study design considerations Study design options Diaries and Setting of Assessment Q&A

epro Consortium Background The epro Consortium was established on April 1, 2011 Mission: To advance the quality, practicality, and acceptability of electronic data capture (EDC) methods used in clinical trials for PRO endpoint assessment Provides a non-competitive, neutral environment for epro providers to collaborate on issues related to capturing patient data electronically

Introduction Assessing equivalence of migrated instruments is necessary Level of equivalence evidence is dependent on the extent that the changes or modifications are likely to have affected the subjects interpretation and responses to the items in the instrument. Lack of consensus around study design considerations for both qualitative and quantitative methods epro Consortium establishing good practices

Levels of Equivalence Evaluation: ISPOR epro Task Force Report Coons SJ, Gwaltney CJ, Hays RD, et al. Recommendations On Evidence Needed To Support Measurement Equivalence Between Electronic And Paper-Based Patient-Reported Outcome (PRO) Measures: ISPOR epro Good Research Practices Task Force Report. Value in Health 2009; 12(4):419-429. http://www.ispor.org/taskforces/eprotf.asp

Introduction: Qualitative Purpose of cognitive interviews Type of interview Exploration Confirmation Concept elicitation interview Cognitive interview for content validity Cognitive interview for migration equivalence x x In migration studies, cognitive interviews are not intended to revisit the content validity of the original instrument. x x

Qualitative Assessment of Conceptual Equivalence Challenges Study design considerations Study design options

Qualitative - Challenges Goal: confirm that the interpretation and response to the items has not changed due to the migration How to assess this using qualitative methods rather than statistical methods? Incorporate both cognitive interview and usability testing in the same interview session

Qualitative Study Design Sample size Considerations Smaller numbers than content validity because saturation is not as difficult to achieve Range: 5 to 20 Is 5 sufficient to have confidence in the results? 10 seems optimal to detect issues 20 may be necessary if subgroups are involved

Qualitative Study Design Considerations Conducting multiple rounds of interviews PRO: if sufficient number of participants, allows for identification of issues, revisions and then further testing in a later round Allows for multiple sites, geographic and other diversity CON: if changes are made based on a small sample to begin with, later groups may contradict those findings Multiple rounds more time consuming Option: conduct 1 round, but pause after first 5 interviews to look for issues, make changes if necessary and then continue with the interviews Sufficient sample size needed for multiple rounds

Qualitative Study Design Considerations Whether to have patients answer both versions of the instrument or only the electronic version Is the comparison between formats helpful or necessary? How to assess response if only one version is answered?

Study Design Option 1 Subjects complete instrument on both modes Determine if responses for any items differ between modes Interview focuses on those items individually to determine if different responses were random (could go either way) or systematic due to differences in meaning or interpretation between modes This approach is not quantitative, responses are not analyzed but discussed with the subject qualitatively to identify reason for difference If latter occurs with a substantial number, the migration of those items should be revisited If changes are made, additional cognitive interviews are recommended If further discrepancies are found, quantitative equivalence study should be considered

Study Design Option 2 Subjects complete instrument on both modes Ask subjects if they think their responses for any items differed between modes Interview focuses on those items individually to determine whether the subject feels that they responded differently because of the mode of administration

Study Design Option 3 Subjects complete new mode only Ask how they interpret what each item is asking them Repeat item in their own words (paraphrasing) Think-aloud Subject s interpretation is compared to an item definition document to see if there is concordance Assumes this document is available from instrument developer or easy to create More closely parallels cognitive interview during instrument development Responses are not examined

Study Design Option 4 Ask only about instructions and/or items that were modified during migration Enables a more focused investigation of potential impact of the changes Subjects asked to read both versions on the two modes and identify any perceived differences in the task or in the interpretation/ meaning of modified items.

Qualitative Study Design Combination of approaches is possible Option 1 with asking about all items Option 1 focusing only on items that have changed Tradeoffs Time to complete both versions in Options 1 and 2 Lack of assessment of response in Options 3 and 4 Difficulty identifying reasons for differences in Option 1

Quantitative Assessment of Statistical Equivalence Challenges Study design considerations Study design options

Quantitative - Challenges Goal: to ensure that migration to the electronic mode did not introduce systematic bias to the scores obtained from the instrument Study Design Challenges Sample size and statistical methods Time between administrations (for crossover designs) Setting and population for study completion

Quantitative Study design considerations Sample size and statistical methods Two-period, crossover design using ICC and test of mean differences n=50-60 suitable in most cases Alternative approaches use item response theory Differential Item Functioning (DIF) Requires large samples; n=300-500

Crossover Design Subject Sample (n=60) Randomization (stratified) epro administration (n=30) Paper administration (n=30) Crossover epro administration (2 days after paper administration) Paper administration (2 days after epro administration)

Study Interval In the equivalence literature, retest intervals range from minutes to weeks apart Shorter intervals Benefit: allow the subject to complete both instrument administrations within a single study visit Risk: Memory effect could bias results Longer intervals might be better suited for modes that don t require physical hardware deployment

Statistical Analysis ICC (3,1) as defined in Shrout & Fleiss (1979) Intraclass Correlations: Uses in Assessing Rater Reliability, Psychological Bulletin, 1979; 86 (2): 420-428. Shrout & Fleiss define six types of ICCs, others have been defined elsewhere Selecting the wrong ICC will impact results ICC threshold = to within-subjects ICC, or 0.70

Quantitative Study design considerations Time between administrations Memory effect from repeated administrations is a threat to the validity of the study In equivalence literature, retest intervals range from 1 minute to several months Interval needs to be long enough to wash-out any memory effect, but short enough to ensure the subject s condition has not changed

Quantitative Study design considerations Strategies for time between administrations 1. Same day completion Subjects complete two administrations within same visit minutes to hours apart Efficient, minimize loss to follow-up Use of distraction task can minimize memory effect

Quantitative Study design considerations Strategies for time between administrations 2. Multi-day/take home completion Two administrations completed at least 24-hours apart May reduce memory effect, requires more planning and follow-up

Experimental Study Designs Limitations of paper test-retest data Older data from different sample Don t have ICCs to compare nor mean differences from paired sample T-test Recommended to generate new source mode retest data Separate single retest arm, extend crossover to a 3 period design

Source Mode Retest Data Test-retest sample, (n=60) Three-Period Crossover Design (n=60) Randomization 1 st Administration - epro 1 st Administration - Paper 1 st Paper administration Crossover 2 nd Paper administration (2 days later) 2 nd Administration - epro 2 nd Administration - Paper Second Crossover 3 rd Administration - epro 3 rd Administration - Paper

Diaries and Setting of Assessment Diaries are instruments that are designed to be completed outside the clinic For both qualitative and quantitative studies, key question is whether it is necessary to take the diary home, outside of the artificial clinic setting, to collect data as part of the equivalence assessment process

Qualitative and Quantitative Diaries are often designed to be completed early in the morning or before bed, or episodically when events occur Interview setting is artificial, not done at the time the diary would normally be completed Therefore responses are more hypothetical in nature or based on recall, or thinking back to the previous event

Challenge of setting Patients completing questionnaires (paper or electronic) in a clinic setting are more likely to focus on the task at hand and one can ensure they complete all required questionnaires This is not possible outside of the clinic setting Paper and electronic may be completed very differently outside of the clinic setting resulting in an apparent lack of equivalence Training/technical issues? Expect low ICC

Challenge of setting How important is it to replicate the circumstance of use that patients will face in the clinical trial? Home doesn t just mean in the comfort of one s own home.. Activities of day to day living that cannot be reproduced in a clinical setting.

Clinic Setting White coat syndrome? Subjects nervous in the clinical setting? More relaxed in their home environment? No daily data Do you ask patients about a hypothetical event/symptom? Possible false sense of the equivalence of paper and electronic given the differences between home and clinic environment

Home Setting Having patients take the diary home may extend the timelines of the equivalence study Additional burden to patient if they need to provide multiple days worth of data? But this is closer to how it would be used in a trial Can we combat Parking lot syndrome with regard to paper as we can in the clinic?

Q&A

Thank you! Contact details for further questions: Sonya Eremenco: sonya.eremenco@evidera.com Diane Wild: diane.wild@oxfordoutcomes.com Jason Lundy: Jlundy@c-path.org Karl McEvoy: karl.mcevoy@crfhealth.com