Washington, DC, November 9, 2009 Institute of Medicine


Holger Schünemann, MD, PhD
Chair, Department of Clinical Epidemiology & Biostatistics
Michael Gent Chair in Healthcare Research
McMaster University, Hamilton, Canada
Washington, DC, November 9, 2009
Institute of Medicine

Disclosure
Documents editor for the American Thoracic Society
Member of several WHO committees
Executive committee member of the ACCP antithrombotic guidelines
Co-convener of a Cochrane Methods Group
Member of ACP COPD Guideline Panel
Co-chair of the GRADE Working Group

Content
Approaches to appraising evidence and developing recommendations:
  Canadian Task Force
  Oxford Centre for Evidence-Based Medicine
  USPSTF
  SORT (Family Medicine)
  Specialty societies (AHA/ACC)
  CDC
  SIGN (not described in detail)
  GRADE: quality of evidence, strength of recommendation

Appraising evidence and developing recommendations
To guide healthcare decision making, a guideline (panel) should weigh the desirable and undesirable consequences related to that decision for the relevant setting on the basis of the best available evidence and integrate values and preferences.
Evidence = observations in the world

Quality of Evidence
In the context of making recommendations:
The quality of evidence reflects the extent to which our confidence in an estimate of the effect is adequate to support a particular recommendation.
Evidence grading systems are frameworks to assess the degree of this confidence.
Guyatt et al., 2008

Desirable and undesirable consequences
Desirable effects:
  lower mortality
  improvement in quality of life, fewer hospitalizations
  reduction in the burden of treatment
  reduced resource expenditure
Undesirable consequences:
  deleterious impact on morbidity, mortality or quality of life (including burden)
  increased resource expenditure

The origin of evidence appraisal systems Canadian Task Force on the Periodic Health Examination, CMAJ, 1979

Hierarchy of evidence
[Figure: study designs arranged as a hierarchy along an axis labelled BIAS]
Randomized Controlled Trials
Cohort Studies and Case Control Studies
Case Reports and Case Series, Non-systematic observations
Expert Opinion

Everything should be made as simple as possible but not simpler. (Albert Einstein)

Simple hierarchies are too simplistic
Concealment of randomization
Blinding (who is blinded in a double-blinded trial?)
Confounding, effect modification and external validity
Intention-to-treat analysis and its correct application
Why do trials stopped early for benefit overestimate treatment effects?
P-values and confidence intervals

Hierarchy of evidence
[Figure: the same hierarchy of study designs along the BIAS axis; Expert Opinion is labelled twice]
Randomized Controlled Trials
Cohort Studies and Case Control Studies
Case Reports and Case Series, Non-systematic observations
Expert Opinion
Schünemann & Bone, 2003

Oxford Centre for Evidence-Based Medicine Levels of Evidence and Grades of Recommendations, 23 November 1999
Grades of recommendation: A, B, C, D
Columns: Therapy/Prevention, Aetiology/Harm; Prognosis; Diagnosis; Economic analysis
Level 1a
  Therapy/Prevention, Aetiology/Harm: SR (with homogeneity*) of RCTs
  Prognosis: SR (with homogeneity*) of inception cohort studies; or a CPG validated on a test set
  Diagnosis: SR (with homogeneity*) of Level 1 diagnostic studies; or a CPG validated on a test set
  Economic analysis: SR (with homogeneity*) of Level 1 economic studies
Level 1b
  Therapy/Prevention, Aetiology/Harm: Individual RCT (with narrow confidence interval)
  Prognosis: Individual inception cohort study with > 80% follow-up
  Diagnosis: Independent blind comparison of an appropriate spectrum of consecutive patients, all of whom have undergone both the diagnostic test and the reference standard
  Economic analysis: Analysis comparing all (critically-validated) alternative outcomes against appropriate cost measurement, and including a sensitivity analysis incorporating clinically sensible variations in important variables
Level 1c
  Therapy/Prevention, Aetiology/Harm: All or none
  Prognosis: All or none case-series
  Diagnosis: Absolute SpPins and SnNouts
  Economic analysis: Clearly as good or better, but cheaper. Clearly as bad or worse but more expensive. Clearly better or worse at the same cost.
Level 2a
  Therapy/Prevention, Aetiology/Harm: SR (with homogeneity*) of cohort studies
  Prognosis: SR (with homogeneity*) of either retrospective cohort studies or untreated control groups in RCTs
  Diagnosis: SR (with homogeneity*) of Level >2 diagnostic studies
  Economic analysis: SR (with homogeneity*) of Level >2 economic studies
Level 2b
  Therapy/Prevention, Aetiology/Harm: Individual cohort study (including low quality RCT; e.g., <80% follow-up)
  Prognosis: Retrospective cohort study or follow-up of untreated control patients in an RCT; or CPG not validated in a test set
  Diagnosis: Any of: independent blind or objective comparison; study performed in a set of non-consecutive patients, or confined to a narrow spectrum of study individuals (or both), all of whom have undergone both the diagnostic test and the reference standard; a diagnostic CPG not validated in a test set
  Economic analysis: Analysis comparing a limited number of alternative outcomes against appropriate cost measurement, and including a sensitivity analysis incorporating clinically sensible variations in important variables
Level 2c
  Therapy/Prevention, Aetiology/Harm: Outcomes research
  Prognosis: Outcomes research
Level 3a
  Therapy/Prevention, Aetiology/Harm: SR (with homogeneity*) of case-control studies
Level 3b
  Therapy/Prevention, Aetiology/Harm: Individual case-control study
  Diagnosis: Independent blind comparison of an appropriate spectrum, but the reference standard was not applied to all study patients
  Economic analysis: Analysis without accurate cost measurement, but including a sensitivity analysis incorporating clinically sensible variations in important variables
Level 4
  Therapy/Prevention, Aetiology/Harm: Case-series (and poor quality cohort and case-control studies)
  Prognosis: Case-series (and poor quality prognostic cohort studies)
  Diagnosis: Any of: reference standard was unobjective, unblinded or not independent; positive and negative tests were verified using separate reference standards; study was performed in an inappropriate spectrum** of patients
  Economic analysis: Analysis with no sensitivity analysis
Level 5
  Therapy/Prevention, Aetiology/Harm: Expert opinion without explicit critical appraisal, or based on physiology, bench research or first principles
  Prognosis: Expert opinion without explicit critical appraisal, or based on physiology, bench research or first principles
  Diagnosis: Expert opinion without explicit critical appraisal, or based on physiology, bench research or first principles
  Economic analysis: Expert opinion without explicit critical appraisal, or based on economic theory
Oxford Centre for Evidence-Based Medicine (Chris Ball, Dave Sackett, Bob Phillips, Brian Haynes, and Sharon Straus).

USPSTF - Grade Definitions After May 2007: Certainty
Level of certainty: High
  The available evidence usually includes consistent results from well-designed, well-conducted studies in representative primary care populations. These studies assess the effects of the preventive service on health outcomes. This conclusion is therefore unlikely to be strongly affected by the results of future studies.
Level of certainty: Moderate
  The available evidence is sufficient to determine the effects of the preventive service on health outcomes, but confidence in the estimate is constrained by such factors as:
    The number, size, or quality of individual studies.
    Inconsistency of findings across individual studies.
    Limited generalizability of findings to routine primary care practice.
    Lack of coherence in the chain of evidence.
  As more information becomes available, the magnitude or direction of the observed effect could change, and this change may be large enough to alter the conclusion.
Level of certainty: Low
  The available evidence is insufficient to assess effects on health outcomes. Evidence is insufficient because of:
    The limited number or size of studies.
    Important flaws in study design or methods.
    Inconsistency of findings across individual studies.
    Gaps in the chain of evidence.
    Findings not generalizable to routine primary care practice.
    Lack of information on important health outcomes.
  More information may allow estimation of effects on health outcomes.
The USPSTF defines certainty as "likelihood that the USPSTF assessment of the net benefit of a preventive service is correct."

Recommendations for prognosis Use prognostic information to determine baseline risk for healthcare decisions


Centers for Disease Control and Prevention (CDC)
Columns: Evidence of Effectiveness; Execution (Good or Fair); Design Suitability (Greatest, Moderate, or Least); Number of Studies; Consistent; Effect Size; Expert Opinion
Strong: Good; Greatest; At Least 2; Yes; Sufficient; Not Used
Strong: Good; Greatest or Moderate; At Least 5; Yes; Sufficient; Not Used
Strong: Good or Fair; Greatest; At Least 5; Yes; Sufficient; Not Used
Strong: Meet Design, Execution, Number, and Consistency Criteria for Sufficient But Not Strong Evidence; Effect Size Large; Expert Opinion Not Used
Sufficient: Good; Greatest; 1; Not Applicable; Sufficient; Not Used
Sufficient: Good or Fair; Greatest or Moderate; At Least 3; Yes; Sufficient; Not Used
Sufficient: Good or Fair; Greatest, Moderate, or Least; At Least 5; Yes; Sufficient; Not Used
Expert Opinion: Varies; Varies; Varies; Varies; Sufficient; Supports a Recommendation
Insufficient: A. Insufficient Designs or Execution; B. Too Few Studies; C. Inconsistent; D. Small; E. Not Used

Grading of Recommendations Assessment, Development and Evaluation (GRADE)
Aim: to develop a common, transparent and sensible system for grading the quality of evidence and the strength of recommendations
Since 2000
Guideline developers, methodologists & clinicians from around the world
CMAJ 2003, BMJ 2004, BMC 2004, BMC 2005, AJRCCM 2006, Chest 2006, BMJ 2008

GRADE Uptake
World Health Organization
Allergic Rhinitis and its Impact on Asthma (ARIA) guidelines
American Thoracic Society
American College of Physicians
European Respiratory Society
European Society of Thoracic Surgeons
British Medical Journal
Infectious Diseases Society of America
American College of Chest Physicians
UpToDate
National Institute for Health and Clinical Excellence (NICE)
Scottish Intercollegiate Guidelines Network (SIGN)
Cochrane Collaboration
Clinical Evidence
Agency for Healthcare Research and Quality (AHRQ)
Partner of GIN
Over 40 major organizations

The GRADE approach
Clear separation of 2 issues:
1) Quality of evidence: 4 categories (very low, low, moderate, or high)
   methodological quality of evidence
   likelihood of systematic deviation from truth
   by outcome
2) Recommendation: 2 grades, weak/conditional or strong (for or against)
   quality of evidence is only one factor
*www.gradeworkinggroup.org

Determinants of quality
RCTs start high; observational studies start low
5 factors that can lower quality:
1. limitations of detailed design and execution
2. inconsistency
3. indirectness
4. publication bias
5. imprecision
3 factors that can increase quality:
1. large magnitude of effect
2. all plausible confounding may be working to reduce the demonstrated effect, or to increase the effect if no effect was observed
3. dose-response gradient
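The determinants above can be sketched as a small rating helper. This is only an illustration under stated assumptions: the slide does not say how many levels each factor moves the rating, so the per-factor step sizes passed in here are hypothetical, and the function name `rate_quality` and the shortened factor labels are invented for this example. In practice GRADE ratings are judgments made per outcome, not arithmetic.

```python
# Illustrative sketch only: GRADE ratings are judgments, not a mechanical score.
QUALITY_LEVELS = ["very low", "low", "moderate", "high"]

# Factor names paraphrase the slide; the labels themselves are assumptions.
LOWERING_FACTORS = {
    "limitations of design and execution",
    "inconsistency",
    "indirectness",
    "publication bias",
    "imprecision",
}
RAISING_FACTORS = {
    "large magnitude of effect",
    "plausible confounding",
    "dose-response gradient",
}


def rate_quality(randomized, lowered=None, raised=None):
    """Return a quality category for one outcome.

    randomized: True if the evidence comes from RCTs (start "high"),
                False for observational studies (start "low").
    lowered / raised: dicts mapping a factor name to the number of
                levels to move (hypothetical step sizes, e.g. 1 or 2).
    """
    lowered = lowered or {}
    raised = raised or {}
    unknown = (set(lowered) - LOWERING_FACTORS) | (set(raised) - RAISING_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")

    start = QUALITY_LEVELS.index("high" if randomized else "low")
    score = start - sum(lowered.values()) + sum(raised.values())
    # Clamp to the four categories named on the slide.
    score = max(0, min(score, len(QUALITY_LEVELS) - 1))
    return QUALITY_LEVELS[score]
```

For example, `rate_quality(True, lowered={"inconsistency": 1, "imprecision": 1})` starts at "high" and moves down two levels, while `rate_quality(False, raised={"dose-response gradient": 1})` starts at "low" and moves up one.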

GRADE - 2004

Evidence Profiles/Summaries


Directness (generalizability, applicability)

Strength of recommendation The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.

Ebell et al, 2004

USPSTF - Grade Definitions After May 2007: Recommendations
Grade A
  Definition: The USPSTF recommends the service. There is high certainty that the net benefit is substantial.
  Suggestions for practice: Offer or provide this service.
Grade B
  Definition: The USPSTF recommends the service. There is high certainty that the net benefit is moderate or there is moderate certainty that the net benefit is moderate to substantial.
  Suggestions for practice: Offer or provide this service.
Grade C
  Definition: The USPSTF recommends against routinely providing the service. There may be considerations that support providing the service in an individual patient. There is at least moderate certainty that the net benefit is small.
  Suggestions for practice: Offer or provide this service only if other considerations support the offering or providing the service in an individual patient.
Grade D
  Definition: The USPSTF recommends against the service. There is moderate or high certainty that the service has no net benefit or that the harms outweigh the benefits.
  Suggestions for practice: Discourage the use of this service.
I Statement
  Definition: The USPSTF concludes that the current evidence is insufficient to assess the balance of benefits and harms of the service. Evidence is lacking, of poor quality, or conflicting, and the balance of benefits and harms cannot be determined.
  Suggestions for practice: Read the clinical considerations section of USPSTF Recommendation Statement. If the service is offered, patients should understand the uncertainty about the balance of benefits and harms.
* The USPSTF defines certainty as "likelihood that the USPSTF assessment of the net benefit of a preventive service is correct."
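The grade definitions above read like a lookup from certainty and net benefit to a letter grade, which can be sketched as follows. This is a minimal illustration, not a USPSTF tool: the function name `uspstf_grade` and the net-benefit label "none or harmful" are invented for the example, and mapping low certainty to an I statement is an assumption drawn from the slide's description of low certainty as insufficient evidence.

```python
# Illustrative lookup based on the grade definitions on the slide.
CERTAINTY_LEVELS = {"high", "moderate", "low"}
# "none or harmful" is a hypothetical label for "no net benefit or
# harms outweigh the benefits".
NET_BENEFIT_LEVELS = {"substantial", "moderate", "small", "none or harmful"}


def uspstf_grade(certainty, net_benefit):
    """Map (certainty, net benefit) to A, B, C, D or I."""
    certainty = certainty.lower()
    net_benefit = net_benefit.lower()
    if certainty not in CERTAINTY_LEVELS:
        raise ValueError(f"unknown certainty: {certainty!r}")
    if net_benefit not in NET_BENEFIT_LEVELS:
        raise ValueError(f"unknown net benefit: {net_benefit!r}")

    if certainty == "low":
        # Assumption: insufficient evidence leads to an I statement.
        return "I"
    if net_benefit == "substantial":
        # High certainty of substantial benefit is A; moderate certainty
        # of moderate-to-substantial benefit is B.
        return "A" if certainty == "high" else "B"
    if net_benefit == "moderate":
        return "B"
    if net_benefit == "small":
        # At least moderate certainty that the net benefit is small.
        return "C"
    # Moderate or high certainty of no net benefit or net harm.
    return "D"
```

The sketch covers only the letter grade; the suggestions for practice attached to each grade would still need to be read alongside it.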

GRADE Determinants of the strength of a recommendation - Judgments

Avian Influenza: judgments about recommendation
Factors that can weaken the strength of a recommendation. Example: treatment of H5N1 patients with oseltamivir
Columns: Factor; Decision (Yes / No); Explanation
Lower quality evidence; Yes / No; The quality of evidence is very low.
Uncertainty about the balance of benefits versus harms and burdens; Yes / No; The benefits are uncertain because several important or critical outcomes were not measured. However, the potential benefit is very large despite potentially small relative risk reductions.
Uncertainty or differences in values; Yes / No; All patients and care providers would accept treatment for H5N1 disease.
Uncertainty about whether the net benefits are worth the costs; Yes / No; For treatment of sporadic patients the price is not high ($45).
Frequent yes answers will increase the likelihood of a weak recommendation

Example: Oseltamivir for Avian Flu
Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (strong recommendation, very low quality evidence).
Values and Preferences
Remarks: This recommendation places a high value on the prevention of death in an illness with a high case fatality. It places relatively low values on adverse reactions, the development of resistance and costs of treatment.
Schünemann et al., The Lancet ID, 2007

Other explanations
Remarks: Despite the lack of controlled treatment data for H5N1, this is a strong recommendation, in part, because there is a lack of known effective alternative pharmacological interventions at this time. The panel voted on whether this recommendation should be strong or weak and there was one abstention and one dissenting vote (13 total).

Implications of a strong recommendation
Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
Clinicians: Most patients should receive the recommended course of action
Policy makers: The recommendation can be adapted as a policy in most situations

Implications of a weak (conditional) recommendation
Patients: The majority of people in this situation would want the recommended course of action, but many would not
Clinicians: Be more prepared to help patients make a decision that is consistent with their own values (decision aids and shared decision making)
Policy makers: There is a need for substantial debate and involvement of stakeholders

Should there be no assessment? There is a very good alternative to using the system to rate clinical guidelines: clinicians and organizations should use published guidelines while considering the clinical context, the credentials, and any conflicts of interest among the authors, as well as the expertise, experience, and education of the practitioner. Kavanagh, 2009

Should there be no appraisal?
Users of recommendations want to know: quality of evidence, recommendation
Guideline panels have more resources, time and expertise than the individual practitioner
Just present the evidence?

Summary
Limitations of older systems:
  Lack a well-articulated conceptual framework
  Criteria not comprehensive or transparent
  Focus on single outcomes (new systems: explicit evaluation of the importance of all important outcomes)
  Confuse quality of evidence with strength of recommendations (new systems: transparent criteria for moving from evidence to recommendations)
New systems' strengths:
  Group of international guideline developers
  Explicit acknowledgment of values and preferences
  Clear, pragmatic interpretation of strong versus conditional/weak recommendations for clinicians, patients, and policy makers
  Useful for SR and HTA, as well as guidelines

Conclusions
Evidence appraisal systems:
  Should provide a systematic framework
  General structure similar to initial approach
  Detailed assessment criteria better described
  Overlap between sophisticated systems
  Required because research methods complex
  Expert opinion evidence: interpretation
  Detailed guidance for developers and users needed
Strength of recommendations:
  Balance of benefits/harms, values, resource use?
  Can evidence be insufficient?