The Journey from Evidence to Guidelines to Measures to Comparative Effectiveness
Vincenza Snow, MD, FACP, Director, Department of Clinical Programs and Quality of Care
Who We Are: The US's largest medical specialty organization. 129,000 members: internists, sub-specialists in internal medicine, residents and fellows in training, and medical students. Headquarters in Philadelphia and an office in Washington, D.C.
Background: ACP Clinical Guidelines. The Clinical Efficacy Assessment Subcommittee (CEAS) was established in 1981 as a technology assessment activity. Early guidelines covered screening tests and laboratory and imaging testing. Guidelines now cover many different internal medicine topics. There are currently over 20 active guidelines.
Two Products. Clinical Guidelines: a systematic review of available evidence and a guideline statement with recommendations; evidence and recommendations are graded for strength and quality. Clinical Guidance Statements: a review of available guidelines and summary recommendations.
How Are the Topics Selected? Prevalence; impact on mortality and morbidity; effective health care available; areas of uncertainty and evidence that current performance is deficient; cost; likelihood of availability of strong evidence; relevance to internal medicine.
Review for Clinical Guidance Statements: Search Medline and the NGC (National Guideline Clearinghouse), and consult experts in the field. Use the AGREE instrument to rate guidelines. Summarize the guidelines and their recommendations. Make a summary recommendation based on the other guidelines.
Guideline Development Process: formulate questions for the evidence review; systematic evidence review; background evidence-review paper; guideline paper (recommendations); CEAS meetings and conference calls; CEAS Guideline Sub-Panel conference calls; internal review; external review.
Guideline Development Process: takes 18-24 months on average; long approval process; the guideline and background paper are then submitted to the journal (independent peer review); the shelf life of ACP guidelines is 5 years.
How are recommendations formulated? A systematic evidence review answers specific questions, e.g., does cancer screening lead to decreased mortality? If possible, a meta-analysis should be done. The quality of the evidence is assessed using accepted grading systems such as GRADE, USPSTF, or ACC. Recommendations are formulated based on the results of the evidence review and rated for strength (GRADE, USPSTF, ACC, etc.).
How is it that different groups looking at the same body of evidence come to different conclusions?
Where are the differences? Are the recommendations formulated after the evidence is reviewed, or are they partly formulated ahead of time, with evidence then sought to support them? Are gaps in evidence filled by consensus and expert opinion? What are the evidence thresholds? How much extrapolation from ideal conditions and/or highly selected populations in clinical trials is allowed?
Hypothetical example of CT Colonography: A prospective, multicenter clinical trial found that, in 1200 patients, CT colonography (CTC) has a sensitivity of 93% and a specificity of 98% for polyps >10 mm. It was not designed to show mortality benefits, but many early-stage cancers were detected. Guideline A recommends CTC for screening, Guideline B does not recommend CTC, and Guideline C recommends CTC in very specific populations and settings.
Additional information to take into account: The population included a significant number of high-risk individuals. Nine centers participated, but the results were mostly driven by one center that had the largest number of patients and the most experienced operators. The study used multi-detector CT scanners, which are not always available. The screening interval was unclear. Radiation risks were not accounted for.
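Why the study population matters can be illustrated with a short calculation. The sketch below applies Bayes' rule to the sensitivity and specificity from the hypothetical trial to get the positive predictive value (PPV) at two prevalences of polyps >10 mm. The prevalence figures and population labels are assumptions chosen only for illustration; they are not reported by the hypothetical trial.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive result is a true positive (Bayes' rule)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Test characteristics from the hypothetical CTC trial (polyps >10 mm).
SENSITIVITY = 0.93
SPECIFICITY = 0.98

# ASSUMPTION: illustrative prevalences only, not taken from the trial.
ASSUMED_PREVALENCE = {
    "average-risk screening population": 0.05,
    "high-risk enriched population": 0.20,
}


def ppv_by_population():
    """PPV for each assumed population, keyed by its label."""
    return {
        label: positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
        for label, prevalence in ASSUMED_PREVALENCE.items()
    }


if __name__ == "__main__":
    for label, ppv in ppv_by_population().items():
        print(f"{label}: PPV = {ppv:.2f}")
```

Under these assumed prevalences, the same test gives a PPV of roughly 0.71 in the lower-prevalence group and roughly 0.92 in the higher-prevalence group, which is one reason results from a high-risk study population may not transfer directly to general screening.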
How did the groups come to their recommendations then? Guideline group A felt that the ability to detect polyps with high sensitivity and specificity was enough to recommend the procedure. They felt that the mortality benefit could be extrapolated from the benefit of patients with positive results going on to diagnostic/therapeutic colonoscopy. They did not take into account possible harms.
Reasons for the Group B recommendation against CTC: Guideline group B felt that the evidence was insufficient to recommend screening in a general population, since the population in the study was not a true screening population. CTC availability and operator expertise are not widespread, and there is no proven mortality benefit. In addition, there was no clear screening interval, and harms were not taken into consideration.
Reasons for the Group C recommendation: Guideline group C felt that the study supported screening in very specific populations such as the one in the study, which included high-risk people who refused colonoscopy and people with contraindications. The group also recommended that this be done only in centers that had appropriate CT scanners and trained operators. The group, however, rated the strength of the recommendation as weak.
Now what? How do we take guideline recommendations and make them into evidence-based measures that are used for accountability, rewards or penalties, or quality improvement? How do we prioritize the different recommended options? How do we know that implementing these guideline recommendations actually leads to improved outcomes in real practice and real patients?
Another Example: Cholesterol Guidelines. Reasonable groups looked at the same large body of evidence and came to reasonable but different conclusions. Cholesterol guidelines in patients with DM: NCEP, target of <100; ACC/AHA, target of <100 or even lower; ACP, if the patient has a cardiac risk factor in addition to DM, use a moderate dose of a statin regardless of lipid levels.
How guidelines can lead to the questions we need answered in CER: Current practice is generally to start a statin at a lower dose, do repeated testing and visits, and titrate the dose until the patient is at target and is not having side effects. This is NOT what was done in the studies. ACP recommends starting a statin at a moderate dose (the doses used in the trials) regardless of lipid level, and testing only if the patient has symptoms.
How guidelines can lead to the questions we need answered in CER: Performance measures are dominated by targets. Clinicians and patients are spending a lot of time, effort, and money treating to targets. But do we know which strategy really works better?
Example of a CER question: Is it more efficacious to titrate statin therapy to target cholesterol levels, or to start on moderate doses of a statin and only test for possible side effects? Measured outcomes could be: lipid levels; fatal and non-fatal MI and stroke; cost, including visits, testing, etc.; QOL, adherence, patient preferences, and satisfaction.
Another example of a CER question coming from guidelines: ACP 2008 Guidelines on pharmacological treatment of dementia. Further research is needed to evaluate the effectiveness of pharmacologic therapy for dementia and to assess whether treatment affects outcomes, such as institutionalization. Evaluation of the appropriate duration of therapy and more head-to-head comparisons of agents are needed. Finally, assessment of the effectiveness of combination therapy is lacking.
CER Questions for this Guideline: Which therapies for dementia lead to less institutionalization? Are there sub-groups of patients who benefit more from certain dementia therapies? Is there a way to predict those patients who will not respond well to therapy? Are combinations safe, effective, and cost-effective?
Summary: Evidence-based guidelines can have conflicting recommendations. Clinicians are measured according to these recommendations. Sometimes practice is dominated by one set of recommendations, or there are significant variations in practice due to differing sets of recommendations. CER can help address the questions brought to light in guidelines and practice.
Questions