Examining the reliability of published research findings


Examining the reliability of published research findings

Roger Chou, MD
Associate Professor of Medicine, Department of Medicine and Department of Medical Informatics and Clinical Epidemiology
Scientific Director, Oregon Evidence-based Practice Center
May 4, 2012

Learning objectives
- Discuss sources of bias in research studies and how they affect results and conclusions
- Describe how to critically appraise research studies and determine when findings can be applied to clinical practice
- Learn when to trust and use systematic reviews and guidelines

Estimated probabilities of true positive findings
Ioannidis mathematically modeled the likelihood that a positive research finding is actually true, based on Bayes' theorem for updating probabilities. The post-test probability that a finding is true depends on:
- the pre-test probability
- the statistical power of the study
- the extent of bias in the study
(Ioannidis JPA (2005) PLoS Med 2(8): e124)

Power | Pre-test probability | Bias | Example | Probability that the finding is true
Low (0.20) | Very low (1:1000) | Limited | Discovery-oriented exploratory research with massive testing | 0.0015
High (0.80) | Low (1:10) | Moderate | Adequately powered exploratory epidemiological study | 0.20
Low (0.20) | Moderate (1:5) | High | Underpowered, poorly performed RCT | 0.17
Low (0.20) | Moderate (1:5) | Moderate | Underpowered, well performed RCT | 0.23
High (0.80) | High (1:1) | Very low | Adequately powered RCT with little bias | 0.85

Empiric evidence of misleading research findings
A study of highly cited studies in high-impact journals found that 45 of 49 studies claimed an intervention was effective. Of these, 16% were contradicted by subsequent studies, 16% reported effects stronger than subsequent studies, 44% were replicated, and 24% had no attempt at replication. Likelihood of being contradicted or of reporting stronger effects than later studies: nonrandomized studies, 5 of 6; RCTs, 9 of 39 (more likely with smaller samples). (Ioannidis JAMA 2005;294:218-228)

When are positive research findings more likely to be true?
- Predefined, focused hypothesis testing
- Replicates previous findings
- Larger sample sizes studied
- Larger effects reported
- Study designed and executed to minimize bias
Studies need to be critically appraised to determine their risk of bias.
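Ioannidis's post-study probability can be computed directly. The sketch below implements the published formula — PPV = ((1−β)R + uβR) / (R + α − βR + u − uα + uβR), where R is the pre-study odds, 1−β the power, u the bias, and α the significance level — and reproduces the probabilities in the table above; the specific u values per row follow the worked examples in the 2005 paper.

```python
def ppv(power, R, u, alpha=0.05):
    """Post-study probability that a positive finding is true
    (Ioannidis 2005, PLoS Med 2(8):e124): power = 1 - beta,
    R = pre-study odds that the tested relationship is true,
    u = proportion of analyses driven by bias."""
    beta = 1 - power
    num = (1 - beta) * R + u * beta * R
    den = R + alpha - beta * R + u - u * alpha + u * beta * R
    return num / den

# Rows of the table above (power, pre-study odds, bias):
print(round(ppv(0.80, 1.0,    0.10), 2))   # adequately powered RCT, little bias -> 0.85
print(round(ppv(0.20, 1/5,    0.20), 2))   # underpowered, well performed RCT    -> 0.23
print(round(ppv(0.20, 1/5,    0.80), 2))   # underpowered, poorly performed RCT  -> 0.17
print(round(ppv(0.80, 1/10,   0.30), 2))   # exploratory epidemiological study   -> 0.20
print(round(ppv(0.20, 1/1000, 0.20), 4))   # discovery-oriented massive testing  -> 0.0015
```

Note how bias dominates when pre-test odds are low: even perfect power cannot rescue a 1:1000 hypothesis tested with substantial bias.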

Critical appraisal

All evidence is not equal. Bias is any process, effect, or error in the design or conduct of a study that systematically favors one outcome over others. Studies with greater risk of bias are more likely to be misleading and typically exaggerate treatment effects. Critical appraisal involves judging features of individual studies related to risk of bias and considering their place on a study design-based evidence hierarchy.

Two aspects of quality:
- Internal validity: the extent to which the results of a study can be reliably attributed to the intervention under evaluation
- External validity: the extent to which the results of a study can be applied to other populations and settings

Why patients may appear to improve with a therapy
- Natural history
- Regression to the mean: treatment is typically sought when symptoms are worst
- Hawthorne and other non-specific effects: attention, caring, empathy
- Placebo effect: related to the perception of treatment
- Specific, or true, effects of treatment

The Hawthorne effect is an increase in worker productivity produced by the psychological stimulus of being singled out and made to feel important: simply being studied results in better outcomes. It was first reported after a series of studies conducted at the Western Electric Company's Hawthorne Works in Chicago in the 1920s and 1930s, where every change in working conditions resulted in increased productivity; improving or reducing the lighting produced similar effects.

Uncontrolled studies
Estimates of effectiveness from uncontrolled studies are often retrospective with limited data on baseline characteristics; it is difficult to identify an inception cohort of patients; the studies are typically not blinded and unable to distinguish specific effects of treatment from non-specific effects; and there is high potential for publication bias. Examples of claims based on uncontrolled studies:
- "Mobilization and manipulation studies claim an 80% success rate."
- "80% of low back pain patients get immediate relief with epidural blocks."
- "In the YMCA's exercise program, 80% improve."
- "With microcurrent therapy, 82% were pain free with 10 treatments."
- "70-80% of those carefully screened for radicular symptoms benefit from surgery."
(Deyo RA, Spine 1993;18:2153-2162)
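One reason uncontrolled success rates like those above mislead is regression to the mean: patients enroll when symptoms are at their worst and drift back toward their usual level regardless of treatment. A small simulation makes this visible; all parameters (pain scale, thresholds, noise) are invented for illustration, not taken from the lecture.

```python
import random

random.seed(1)

# Each simulated patient has a stable underlying pain level plus
# day-to-day noise. Patients join an uncontrolled study only on a day
# when pain exceeds a threshold -- i.e., when symptoms are at their worst.
n, threshold = 10_000, 7.0
enrolled_baseline, enrolled_followup = [], []
for _ in range(n):
    true_level = random.gauss(5.0, 1.0)           # stable underlying pain (0-10 scale)
    today = true_level + random.gauss(0.0, 2.0)   # pain on the enrollment day
    if today > threshold:                         # care sought when pain is worst
        later = true_level + random.gauss(0.0, 2.0)  # follow-up pain, NO treatment given
        enrolled_baseline.append(today)
        enrolled_followup.append(later)

mean_before = sum(enrolled_baseline) / len(enrolled_baseline)
mean_after = sum(enrolled_followup) / len(enrolled_followup)
print(f"mean pain at enrollment: {mean_before:.1f}")
print(f"mean pain at follow-up (untreated): {mean_after:.1f}")
# Follow-up scores drift back toward the underlying mean: apparent
# "improvement" with no specific treatment effect at all.
```

Without a control group, this purely statistical improvement is indistinguishable from a treatment effect.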

Therapies for low back pain once thought to be effective based on uncontrolled studies: bed rest, sacroiliac fusion, coccygectomy, prolotherapy, intradiscal steroid injection, and others.

Typical hierarchy of evidence (strongest to weakest):
- Well designed randomized controlled trial
- Well designed non-randomized controlled trial
- Well designed prospective cohort study
- Well designed retrospective cohort study
- Well designed case-control study
- Before-after studies, case series, case reports, descriptive studies, basic science studies, etc.

Types of bias in RCTs

Type of bias | Factors associated with increased risk of bias
Selection bias | Inadequate randomization; inadequate allocation concealment; group dissimilarity at baseline
Performance bias | Inadequate blinding of participants and study personnel/care providers; differential use of co-interventions; differential or suboptimal compliance across groups
Detection bias | Inadequate blinding of outcome assessors; differential timing of outcome assessments
Attrition bias | High or differential loss to follow-up; analysis not intention to treat
Reporting bias | Selective reporting

Randomization
Successful randomization largely eliminates the problem of confounding: even unmeasured or unknown confounders will be equally distributed, given adequate randomization methods and enough patients. It requires two elements:
1. Generation of a truly unpredictable allocation sequence (computer generated, random numbers table, coin toss)
2. Successful implementation of the random allocation sequence
Successful randomization should result in similar baseline characteristics across groups.
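The first of the two elements — a truly unpredictable, computer-generated sequence — can be sketched in a few lines. Permuted-block randomization is used here purely as an illustrative example (the lecture does not prescribe a specific method):

```python
import random

def permuted_block_sequence(n_patients, block_size=4, seed=None):
    """Generate a randomized allocation sequence using permuted blocks:
    within every block, half the patients are assigned A and half B in a
    random order, keeping group sizes balanced throughout recruitment."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_patients:
        block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
        rng.shuffle(block)            # computer-generated random order
        sequence.extend(block)
    return sequence[:n_patients]

seq = permuted_block_sequence(20, block_size=4, seed=42)
print(seq)
print(seq.count("A"), seq.count("B"))  # balanced by construction: 10 and 10
```

The second element — concealment — is organizational rather than computational: the generated sequence is held in sequentially numbered opaque envelopes or by a central randomization service so recruiters cannot foresee the next assignment.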

Randomization: empiric evidence

Transcutaneous electrical nerve stimulation (TENS) for post-operative pain (Carroll D et al. Br J Anaesth 1996;77:798-803):

Type of study | Number of studies | Studies showing a positive treatment effect
Nonrandomized | 19 | 17
Randomized | 17 | 2

Trials of therapies for myocardial infarction (Chalmers T et al. NEJM 1983;309:1358-61):

Randomization method | Studies with prognostic maldistribution | Studies showing a difference in case-fatality
Randomized, concealed allocation | 14% | 9%
Randomized, inadequately concealed allocation | 27% | 24%
Nonrandomized | 58% | 58%

Studies have found that inadequate generation of the randomization sequence can exaggerate effects by up to 50%, though estimates vary. One study found that, when limited to trials with adequate allocation concealment, inadequate randomization exaggerated effects by about 25%. Seven studies found that inadequate allocation concealment exaggerated treatment effects by an average of 18% (95% CI 5-30%).

Group similarity at baseline
Given a large enough sample, successful randomization should result in equally distributed baseline characteristics. Statistically significant baseline differences suggest failure of randomization, due to chance, inadequate randomization or allocation concealment methods, or manipulation of treatment allocation. Non-statistically significant differences on clinically relevant baseline characteristics may also be important.

Blinding
Blinding of participants and study personnel/caregivers (double-dummy designs, sham procedures) reduces bias related to knowledge and expectations of the treatments received, but is not always possible. Blinding of outcome assessors reduces bias related to differential assessment of outcomes and is almost always possible.

Blinding: empiric evidence
Studies show that lack of blinding exaggerates estimates of treatment effects by an average of 14%. Blinding may matter most in studies assessing subjective outcomes: one study found that lack of blinding inflated estimates by 30% in trials with subjective outcomes, with no effect in trials with objective outcomes.

Loss to follow-up
Loss to follow-up introduces attrition bias when those lost to follow-up differ from those who remain in the study. Its importance depends on the amount of loss to follow-up and on the degree of difference across groups (e.g., low in one group but high in the other). Similar bias can occur when subjects are not lost to follow-up but are excluded from analysis because of missing data or other reasons.

Intention-to-treat analysis
Patients may not receive their allocated treatments, adhere to the assigned therapy, or complete the trial. Intention-to-treat (ITT) analysis means analyzing patients according to the group they were assigned to, regardless of which treatments they actually receive or complete. Failure to perform an ITT analysis introduces bias when patients who do not adhere to allocated treatments differ systematically from those who do.

Selective outcome reporting
Studies should report all pre-specified outcomes. In practice:
- 71% of outcomes with p<0.05 are fully reported
- 50% of outcomes with p>0.05 are fully reported
- 30-50% of primary outcomes are changed between protocol and publication

Publication bias
Publication bias is the preferential publication of studies based on the magnitude and direction of their findings: positive and statistically significant findings are more likely to be reported than negative and non-statistically significant findings. If the missing studies are systematically different from the identified studies, bias is introduced into a systematic review; because published studies are more likely to report positive results, pooled estimates become biased. Methods are available for assessing the presence and impact of publication bias: the funnel plot, Egger's test, and the trim-and-fill method.

Impact of publication bias
A meta-analysis of combination chemotherapy vs. monotherapy in ovarian cancer illustrates the impact: based on published trials, the survival ratio (95% confidence interval) was statistically significant (p=0.004); based on registered trials, it was not (p=0.17).

Assumptions of methods for assessing publication bias
Large studies are likely to be published regardless of statistical significance; small studies are at greatest risk of being lost, and the small studies reporting the largest effects are the most likely to be published. The expected result is that bias increases as sample size goes down, so the various methods for assessing publication bias look for small-sample effects. Other causes of small-sample effects include differences in methodological quality and chance.
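This small-study logic is what Egger's regression test formalizes: regress each trial's standardized effect (effect/SE) on its precision (1/SE); in the absence of small-study effects the intercept is near zero, while preferential publication of small positive trials pulls it away from zero. A minimal sketch — the trial effects and standard errors below are invented for illustration, not data from the ovarian cancer example:

```python
# Egger's regression test, minimal form: y = theta/SE regressed on x = 1/SE.
# Invented log odds ratios and standard errors for seven trials, arranged so
# that the smallest trials (largest SEs) report the largest effects.
effects = [0.90, 0.75, 0.60, 0.42, 0.30, 0.12, 0.08]
ses     = [0.50, 0.45, 0.35, 0.28, 0.20, 0.10, 0.08]

x = [1 / se for se in ses]                       # precision
y = [th / se for th, se in zip(effects, ses)]    # standardized effect

n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
intercept = my - slope * mx
print(f"Egger intercept: {intercept:.2f}")  # far from zero -> asymmetry suspected
```

A full implementation would also compute a standard error and p-value for the intercept (e.g., via weighted least squares); the point here is only the direction of the diagnostic.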

Funnel plot
A funnel plot is a scatter plot with sample size (or another measure of variance) on the y-axis and effect size on the x-axis. Because smaller studies show greater variability in effects, the plot should look like an inverted funnel. If smaller studies with larger effects are preferentially published, the base of the funnel will appear skewed.

Funding source
Funding source can be an indirect source of bias. Many studies show that industry-funded studies tend to report more favorable results for the funder's products, potentially via:
- Selective publication of favorable studies
- Selective reporting of favorable outcomes
- "Spinning" of results to make them seem more favorable
- Designing studies to favor one intervention over another: comparing against a less effective intervention, selecting patients more likely to be responsive, excluding patients who may experience side effects

Observational studies
There are many types of observational studies; cohort and case-control studies are often used to evaluate the effects of medical interventions. They have separate comparison groups, which requires proper selection of controls, and they are highly susceptible to confounding, requiring additional safeguards not needed in well-designed RCTs. Confounding by indication is prominent in studies of treatment efficacy, and residual confounding is likely to be present even when techniques such as matching, restriction, or adjustment are used to minimize it. Observational studies also tend to have less control over interventions, co-interventions, adherence, assessment of outcomes, and follow-up.

Limitations of matching
Matching can make two fundamentally different things look comparable:

Characteristic | Apples | Oranges
Shape | Round | Round
Source | Tree | Tree
Edible? | Yes | Yes
Size | Handheld | Handheld
Weight | ½ lb | ½ lb

Matched on every characteristic in the table, apples and oranges still differ in the ways that matter.

When should clinicians adopt research findings into practice?
- Use caution with initial, unreplicated findings: "Proteus effects" describe the tendency for an initial publication of positive findings to be followed by extreme contradictory results, and publication bias compounds the problem
- Use caution with findings from small studies
- Use caution when reported effect sizes are small
- Critically appraise studies to assess the reliability of their findings
- Consider looking to a well-done systematic review or guideline that summarizes all of the available evidence

Evaluating a body of evidence: the GRADE method

Be prepared to re-think practices!
Negative findings are often more likely to be true than positive findings, yet contradicted positive claims often persist for many years despite convincing negative evidence from RCTs:
- 50% of articles on vitamin E still reported a favorable association with cardiovascular risk 8 years after publication of a major contradictory RCT
- Similar persistence occurred for beta-carotene and cancer, and for estrogen and Alzheimer's disease
There can also be failure or unwillingness to abandon established medical practices in the face of convincing negative evidence, as with intensive insulin therapy for ICU patients.

Consistency and reproducibility
Reproducibility of results is a core principle of the scientific method; findings that are reproduced increase the pre-test probability of subsequent studies. Consistent results across higher-quality RCTs are critical, and there is often a discrepancy between early, small trials and later, larger trials:
- Magnesium for myocardial infarction: 7 trials with 1,300 patients showed benefit; a subsequent large trial (ISIS-4) showed no benefit
- CVA or death after CEA with patching vs. primary closure (Rerkasem K. Br J Surgery 2010;97:466)
"Several medium-sized trials of high quality seem necessary to render results trustworthy" (Egger and Davey Smith BMJ 1995;310:752-4).

Example: glucose control in SICU patients
A trial by van den Berghe et al. (NEJM 2001; N=1,548) showed decreased mortality with tight glucose control in SICU patients, and the practice was adopted in many guidelines and settings. Five subsequent trials showed no benefit on mortality (one showed increased mortality) and an increased risk of hypoglycemia.

Finding what works in health care
"The most reliable way to identify benefits and harms associated with various treatment options is a systematic review of comparative effectiveness research." (The Institute of Medicine, 2011)

Why do we need reviews and guidelines?
- Individual studies can be misleading, or can provide conflicting findings
- Primary studies vary in design and quality
- Complex links between interventions and outcomes may not be captured in individual studies
- Individual studies may lack sufficient power to detect small but clinically important effects
- Reviews help build upon previous research: what questions are unanswered, and what research would answer them?

Proliferation of medical literature
- More RCTs were run in the year 2000 than in the entire decade 1965-1975
- The doubling time of biomedical science was about 19 years in 1991 and about 20 months in 2001
- MEDLINE added 683,136 new records in 2008: 13,137 per week, or 1,866 per day

Traditional review articles
Traditional (narrative) reviews do not use systematic literature searches, do not describe how studies were chosen for inclusion, and do not use reproducible methods to assess the validity of included studies. Their evidence is often incomplete or influenced by the biases or opinions of the authors, and they frequently recommend treatments long after those treatments have been shown to be useless, or even harmful.

The systematic review
In its ideal form, a systematic review includes an explicit and detailed description of how it was conducted, so that any interested reader would be able to replicate it (Jadad 1998). Systematic reviews are "systematic" in order to remove bias in finding and reviewing the literature. They use explicit methods for:
- Identifying all relevant studies
- Including or excluding studies
- Rating the validity of studies
- Synthesizing the evidence, either qualitatively or quantitatively ("meta-analysis")
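The quantitative synthesis step can be sketched concretely: in its simplest fixed-effect form, a meta-analysis weights each study's effect estimate by the inverse of its variance, so that larger, more precise studies dominate the pooled result. A minimal illustration — the study effects and standard errors below are invented, not data from the lecture:

```python
import math

def fixed_effect_pool(effects, ses):
    """Inverse-variance fixed-effect meta-analysis: pool study effect
    estimates (e.g., log risk ratios), each weighted by 1/SE^2, and
    return the pooled estimate with its 95% confidence interval."""
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w, e_ := None) if False else sum(
        w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)

# Three hypothetical small trials (log risk ratios and standard errors):
effect, ci = fixed_effect_pool([-0.40, -0.25, -0.10], [0.20, 0.25, 0.15])
print(f"pooled log RR: {effect:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")
```

This also shows why pooling can detect small but clinically meaningful effects that no single underpowered study could: the pooled standard error shrinks as precise studies accumulate. (Random-effects models, which allow for between-study heterogeneity, are the common refinement.)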

Uses of systematic reviews
- Summarize the existing data
- Detect small but clinically meaningful treatment effects
- Evaluate the consistency of results between studies
- Develop and test hypotheses to explain differential outcomes between studies
- Identify research gaps and areas of uncertainty
- For decision-makers, provide an objective framework for collecting and reviewing evidence to guide policy decisions

Clinical practice guidelines
Guidelines are another form of pre-processed evidence that can help guide clinical practice. They should be based on a systematic review of the evidence, but they go beyond systematic reviews by providing recommendations to guide clinical practice rather than just describing the evidence. There should be an explicit link between the evidence and the recommendations. Conflicts of interest are critical, as they can affect the judgments used to develop recommendations. The Institute of Medicine issued standards for guidelines in 2011.

Conclusions
- Positive findings from research studies are often misleading; factors associated with a higher likelihood of falsely positive findings include small sample sizes, low pre-test probability, and risk of bias
- Critical appraisal is vital
- Reproducibility of findings is a core scientific principle
- Don't ignore negative results; consider the whole body of evidence before making judgments
- Systematic reviews and clinical practice guidelines can be useful for guiding clinical practice, but they must adhere to methodological standards to be trustworthy