Guidelines: Evidence or Expert opinion or????

Prakesh Shah
Professor, Department of Pediatrics, Mount Sinai Hospital
Institute of Health Policy, Management and Evaluation
University of Toronto, Canada

Objectives
  Need for guidelines?
  Method of production of guidelines
  Different recommendations and implications
  AGREE II
  Help or hindrance?

Evidence-based clinical decisions

Information proliferation
  How to distinguish sense from nonsense
  Busy health care workers, policy makers, purchasers, administrators: think about reading 4 articles (one claiming benefit, one claiming harm, one showing no difference, and one reaching no conclusion) and then making a decision for your own population of interest
Hierarchy of evidence
  Systematic reviews, meta-analysis
  RCT
  Observational comparison
  Cohort studies
  Case-control studies
  Case report or case series
  Animal experiments/tissue-based studies

Critical Components
  Synthesizing evidence
  Development of guidelines

Synthesizing evidence: Defining the Clinical Question
  Careful consideration of patient-specific and disease-specific factors
    Risk factors
    Prognostic factors
    Prior morbidities
    Treatment characteristics
  Impact of these on the outcomes of interest
  Key component: generalizability
Systematic Searching of the Literature
  Systematic and broad search strategy
  Transparency
  Reproducibility

Critical Appraisal and Grading
  Methodologic quality
  Risk of bias assessment
  Conduct
  Conflict
  Ethics

Developing guidelines
  Previous state
  Current state

P: In patients with acute hepatitis C
I: Should anti-viral treatment be used
C: Compared to no treatment
O: To achieve viral clearance?

  Evidence   Recommendation        Organization
  B          Class I               AASLD (2009)
  II-1       Should be initiated   VA (2006)
  1+         A                     SIGN (2006)
  -/-        Most authorities      AGA (2006)
  -/-        B It works            AWMF (2004)
Before GRADE
  Level of evidence   Source of evidence
  Ia                  Systematic reviews
  Ib                  RCT
  II                  Cohort studies
  III                 Case-control studies
  IV                  Case-series
  V                   Expert opinion
  Recommendations: A, B (levels Ia to III); C (level IV); D (level V)

Level of evidence - example
  AASLD
    A: Multiple RCTs or meta-analysis
    B: Single randomized trial, or nonrandomized studies
    C: Only consensus opinion of experts, case studies, or standard-of-care
  AGA
    Good: Consistent, well-designed, well conducted studies [ ]
    Fair: Limited by the number, quality or consistency of individual studies [ ]
    Poor: important flaws, gaps in chain of evidence
  ACG
    1. Multiple published, well-controlled (?) randomized trials or a well designed systemic (?) meta-analysis
    2. One quality-published (?) RCT, published well-designed cohort/case-control studies
    3. Consensus of authoritative (?) expert opinions based on clinical evidence or from well designed, but uncontrolled or nonrand. clin. trials
  ASGE
    A. RCTs
    B. RCT with important limitations
    C. Observational studies
    D. Expert opinion

Limitations of previous systems
  Confuse quality of evidence with strength of recommendations
  Lack well-articulated conceptual framework
  Criteria not comprehensive or transparent
  Most steps in the grading process were implicit
  Focus on benefit and not all important outcomes
21/01/2017

Why GRADE
  Explicit and comprehensive criteria for downgrading and upgrading quality of evidence
  Clear separation between quality of evidence and strength of recommendations
  Transparent process
  Provides clear, pragmatic interpretations of strong versus weak recommendations
  Provides explicit evaluation of the importance of outcomes of alternative management strategies

Vision
  Globalize the evidence, localize recommendations
  Focus on questions that are important to patients and clinicians
  Undertake collaborative evidence reviews
  Use a common metric to assess the quality of evidence and strength of recommendations

GRADE components (flow diagram)
  P I C O
    Outcomes rated: Critical, Critical, Important, Less
  Systematic review
    Summary of findings & estimate of effect for each outcome
    Quality of evidence: High, Moderate, Low, Very low
    RCT start high, obs. data start low
    Grade down: 1. Risk of bias 2. Inconsistency 3. Indirectness 4. Imprecision 5. Publication bias
    Grade up: 1. Large effect 2. Dose response 3. Confounders
    Rate overall quality of evidence across outcomes based on lowest quality of critical outcomes
  Guideline development
    Formulate recommendations: for or against (direction), strong or weak (strength)
    By considering: quality of evidence, balance benefits/harms, values and preferences
    Revise if necessary by considering: resource use (cost)
    Strength of recommendations: We recommend using; We suggest using; We recommend against using; We suggest against using
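The step "rate overall quality of evidence across outcomes based on lowest quality of critical outcomes" can be sketched in a few lines of Python. This is only an illustration: the function name, the tuple layout, and the example outcomes are hypothetical and do not come from the slides.

```python
# Quality levels from the slides, ordered from worst to best.
LEVELS = ["Very low", "Low", "Moderate", "High"]

def overall_quality(outcomes):
    """Return the lowest quality rating among the critical outcomes.

    `outcomes` is a list of (name, importance, quality) tuples; this
    layout is an assumption made for the sketch.
    """
    critical = [quality for _, importance, quality in outcomes
                if importance == "critical"]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    # The overall rating is driven by the weakest critical outcome.
    return min(critical, key=LEVELS.index)

# Hypothetical outcomes, for illustration only.
example = [
    ("mortality", "critical", "High"),
    ("stroke", "critical", "Low"),
    ("length of stay", "important", "Very low"),
]
print(overall_quality(example))
```

The "important" outcome is rated very low but does not affect the result, because only critical outcomes are considered.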
GRADE: Quality of evidence
  The extent to which one can be confident that an estimate of effect or association is correct

Components determining quality
  RCTs start high
  Observational studies start low
  What lowers quality of evidence? 5 factors
    Methodological limitations
    Inconsistency of results
    Indirectness of evidence
    Imprecision of results
    Publication bias

Quality of evidence - Summary
  Study design          Quality of evidence
  Randomized trial      High
                        Moderate
  Observational study   Low
                        Very low
  Lower if
    Study limitations
    Inconsistency
    Indirectness
    Imprecision
    Publication bias
  Higher if
    Large effect (e.g., RR 0.5)
    Very large effect (e.g., RR 0.2)
    Evidence of dose-response gradient
    All plausible confounding would reduce a demonstrated effect

Conceptualizing quality
  High: We are very confident that the true effect lies close to that of the estimate of the effect.
  Moderate: We are moderately confident in the estimate of effect: the true effect is likely to be close to the estimate of effect, but there is a possibility that it is substantially different.
  Low: Our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
  Very low: We have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.
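The bookkeeping behind the summary table can be sketched as follows. GRADE ratings are judgments, not a score, so this only shows how a starting level moves once the panel has decided how many levels to grade down or up. The function name and the use of whole-level steps are assumptions of this sketch.

```python
# Quality levels from the slides, ordered from worst to best.
LEVELS = ["Very low", "Low", "Moderate", "High"]

# Starting points stated in the slides.
START = {"rct": "High", "observational": "Low"}

def rate_quality(design, levels_down=0, levels_up=0):
    """Move the starting level down or up and clamp to the four levels.

    `levels_down` and `levels_up` are the panel's judgments (hypothetical
    inputs here), e.g. one level down for serious inconsistency or one
    level up for a large effect.
    """
    if design not in START:
        raise ValueError("design must be 'rct' or 'observational'")
    index = LEVELS.index(START[design]) - levels_down + levels_up
    index = max(0, min(len(LEVELS) - 1, index))  # cannot leave the scale
    return LEVELS[index]

print(rate_quality("rct", levels_down=1))          # RCTs with one serious concern
print(rate_quality("observational", levels_up=1))  # observational data, large effect
```

Keeping the judgments as explicit inputs mirrors the transparency goal: the rating itself is mechanical once each downgrade or upgrade decision has been recorded.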
Strength of recommendation
  The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects

Factors that can weaken the strength of a recommendation
  Lower quality evidence
  Uncertainty about the balance of benefits versus harms and burdens
  Uncertainty or differences in patients' values
  Uncertainty about whether the net benefits are worth the costs

Determinants and explanation
  Quality of evidence: The higher the quality of evidence, the more likely a strong recommendation is warranted.
  Balance between desirable and undesirable consequences: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
  Values and preferences: The greater the variability or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
  Costs: The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.
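The four recommendation wordings in the GRADE flow ("We recommend using", "We suggest using", "We recommend against using", "We suggest against using") combine a direction with a strength. A minimal sketch of that mapping, with a hypothetical function name:

```python
def recommendation_phrase(direction, strength):
    """Build the recommendation wording from direction and strength.

    direction: "for" or "against"; strength: "strong" or "weak".
    Strong recommendations use "recommend", weak ones use "suggest".
    """
    verbs = {"strong": "recommend", "weak": "suggest"}
    if strength not in verbs:
        raise ValueError("strength must be 'strong' or 'weak'")
    if direction == "for":
        return f"We {verbs[strength]} using"
    if direction == "against":
        return f"We {verbs[strength]} against using"
    raise ValueError("direction must be 'for' or 'against'")

print(recommendation_phrase("for", "weak"))
```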
Implications of a strong recommendation
  Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
  Clinicians: Most patients should receive the recommended course of action
  Policy makers: The recommendation can be adopted as a policy in most situations
Implications of a weak recommendation
  Patients: The majority of people in this situation would want the recommended course of action, but many would not
  Clinicians: Be prepared to help patients make a decision that is consistent with their own values (decision aids and shared decision making)
  Policy makers: There is a need for substantial debate and involvement of stakeholders

Patient values & preferences
  In the absence of evidence, guideline panels have to function as surrogates to estimate values and preferences (V&P)
  Attaching V&P statements to guideline recommendations increases transparency
  Consumer involvement can help
  However,
    Subjective and individualized
    May vary among patients and the guideline authors
    Patients' values and preferences are usually unavailable

Conflicts of Interest
  Voluntary verbal and/or written disclosure
  Panel members recuse themselves from voting if they have a potential conflict of interest
  A methodologist on panel
    Objectivity in data analysis and ranking of evidence
    Preparation of evidence tables
    Facilitating consensus
  https://www.nhlbi.nih.gov/health-pro/guidelines/about
What GRADE isn't
  Not another risk of bias tool
  Not a quantitative system (no scoring required)
  Does not eliminate COI, but is able to minimize it

Limitations of Guidelines
  If inaccurate, they can increase harm
  Authors' own values and preferences, or COI
  Patient preferences and values are usually not captured
  Generalized recommendations for the average patient
  Do not specifically address resources and costs
  Inconsistencies in study quality evaluation
  Lim W. et al. Am Soc Hematology 2008; 26-30

Example: Cervical cerclage
  ACOG (Feb 2014), RCOG (May 2011), SOGC (Dec 2013)

Comparison
  History indicated cerclage
    ACOG: ≥1 second-trimester pregnancy losses
    RCOG: ≥3 previous second-trimester losses or PTB
    SOGC: ≥3 previous second-trimester losses or PTB
    H/O spontaneous loss or PTB if the cervical length is 25 mm before 24 w
  Transabdominal
    Previous failed transvaginal cerclage
    Not recommended for Müllerian anomalies or women who have undergone cervical surgery
    Not mentioned
Comparison
  Emergency cerclage candidates
    ACOG: After ruling out CI in singleton gestation who have cervical changes
    RCOG: Rescue suture should be individualized
    SOGC: Cervix has dilated to <4 cm without contractions before 24 weeks GA
  Removal
    ACOG: 36-37 w
    RCOG: 36-37 w
    SOGC: 36-38 w
  In PPROM
    ACOG: No firm recommendations
    RCOG: Removal can be considered after 48 hours for TPTL <34 weeks
    SOGC: Advocates removal after 48 hours

AGREE II
  23 items, 6 domains
    Scope and purpose
    Stakeholder involvement
    Rigor of development
    Clarity of presentation
    Applicability
    Editorial independence
  2 global rating questions

Example
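AGREE II domain scores are usually reported as scaled percentages. The sketch below follows the calculation described in the AGREE II user's manual, which is not shown in these slides: items are rated 1 to 7, and the obtained total is scaled between the minimum and maximum possible totals for that domain. The appraiser ratings are hypothetical.

```python
def scaled_domain_score(ratings, low=1, high=7):
    """Scaled AGREE II domain score as a percentage.

    `ratings` holds one list per appraiser, each with that appraiser's
    item scores for a single domain (assumed 1-7 scale).
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every appraiser must rate every item")
    obtained = sum(sum(r) for r in ratings)
    minimum = low * n_items * n_appraisers
    maximum = high * n_items * n_appraisers
    return 100 * (obtained - minimum) / (maximum - minimum)

# Hypothetical ratings: 4 appraisers scoring the 3 items of one domain.
domain_ratings = [[5, 6, 6], [6, 6, 7], [2, 4, 3], [3, 3, 2]]
print(round(scaled_domain_score(domain_ratings)))
```

The wide spread between appraisers in this made-up example is one way the subjectivity of the ratings shows up in practice.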
AGREE II
  Subjectivity is still an issue

pshah@mtsinai.on.ca
Risk of bias

Inconsistency of results
  Look for explanation for inconsistency
    patients, intervention, comparator, outcome, methods
  Judgment
    variation in size of effect
    overlap in confidence intervals
    statistical significance of heterogeneity
    I² or ²

Heterogeneity
  Neurological or vascular complications or death within 30 days of endovascular treatment (stent, balloon angioplasty) vs. surgical carotid endarterectomy (CEA)
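The I² statistic named under inconsistency can be computed from Cochran's Q. The sketch below assumes a fixed-effect, inverse-variance pooled estimate and uses made-up log risk ratios; neither the method details nor the numbers come from the slides.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (Q, I2 in percent) for study effects and their standard errors.

    Effects are assumed to be on an additive scale (e.g. log risk ratios).
    """
    weights = [1 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of variation beyond what chance alone would explain.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Hypothetical log risk ratios and standard errors from three trials.
q, i2 = cochran_q_and_i2([-0.40, -0.10, 0.30], [0.15, 0.20, 0.25])
print(f"Q = {q:.2f}, I2 = {i2:.0f}%")
```

A high I² is a prompt to look for explanations (patients, intervention, comparator, outcome, methods), not a rule that triggers downgrading on its own.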
Indirect comparisons
  Interested in head-to-head comparison
  Drug A versus drug B
  but what if not studied?

Indirectness of evidence
  Differences in
    Patients (early cirrhosis vs end-stage cirrhosis)
    Interventions (CRC screening: flex. sig. vs colonoscopy)
    Comparator (e.g., differences in dose)
    Outcomes (non-steroidal safety: ulcer on endoscopy vs symptomatic ulcer complications)

Imprecision of results
  Any stroke (or death) within 30 days of endovascular treatment (stent, balloon angioplasty) vs. surgical carotid endarterectomy (CEA)

Publication bias
  All phase II and III licensing trials for antidepressant drugs between 1987 and 2004
  74 trials; 23 were not published
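The antidepressant figures translate into a simple proportion. The short calculation below only uses the two counts quoted above; the variable names are chosen for illustration.

```python
# Counts quoted in the publication bias example.
total_trials = 74
unpublished = 23

published = total_trials - unpublished
share_unpublished = unpublished / total_trials

print(f"{published} published, {unpublished} unpublished "
      f"({share_unpublished:.1%} of all trials)")
```

Roughly one trial in three was never published, so a review limited to the published literature would miss a substantial share of the evidence.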