Risk of Thyroid Cancer Based on Thyroid Ultrasound Imaging Characteristics
Diabetes Update and Advances in Endocrinology & Metabolism
Vickie A Feldstein MD
Rebecca Smith-Bindman MD
Department of Radiology & Biomedical Imaging
RSB: Director, Radiology Outcomes Research Lab; Epidemiology and Biostatistics
University of California, San Francisco

Conflict of Interest
- None
Utilization of Diagnostic Imaging
- Imaging has increased dramatically over the last 20 years
- This is due to increased imaging by radiologists and diverse medical specialists

Patterns of Imaging
Smith-Bindman R et al. JAMA 2012
Ultrasound Imaging
- Utilization has doubled over the last 15 years
- US volume is greater than CT, MRI and Nuclear Medicine combined

Factors that Contribute to Increase in Imaging
- Improvement in technology
- Increased capacity due to proliferation of equipment
- Patient demand
- Physician demand
- Malpractice concerns
- Relatively few guidelines for imaging
- High profitability
Thyroid Ultrasound
- Use has increased dramatically
- The increase coincided with increased ownership of US machines by endocrinologists and surgeons
- This is a well-described area of overuse

Choosing Wisely
- American Board of Internal Medicine Foundation supported project to reduce imaging, testing, treatment
- Five Things Physicians and Patients Should Question
- A large number (49) of societies contribute to the campaign
- Each recommends decreasing testing in 5 specific clinical scenarios where there is overuse
- 33 societies include reducing 1 or more areas of imaging
- www.choosingwisely.org
Endocrine Society / American Association of Clinical Endocrinologists
List of What Patients and Physicians Should Question
- Multiple daily self-glucose monitoring in adults with stable type 2 diabetes
- Routine measurement of 1,25-dihydroxyvitamin D
- Total/free T3 when assessing T4 dose in hypothyroid patients
- Testosterone without biochemical evidence of deficiency
- Thyroid ultrasound after abnormal thyroid function tests

Why is Thyroid US on the Choosing Wisely List of Overused Tests?
- Thyroid ultrasound is used to identify and characterize thyroid nodules
- US is not part of the evaluation of thyroid function tests
- Incidentally discovered thyroid nodules are common
- Overzealous use of ultrasound will frequently identify nodules that are unrelated to the abnormal thyroid function, and may divert the clinical evaluation toward the nodules rather than the thyroid dysfunction
Potential Harm of Too Much Thyroid Ultrasound
- Patients eventually diagnosed with cancer that was missed on the initial exam: false negative
- Patients never diagnosed with cancer whose work-up was prompted by US: false positive

Patients with Thyroid Cancer
- Steep rise in the diagnosis of thyroid cancer
- No associated decline in mortality
- Thought to largely reflect increased detection
- Thought to largely reflect over-diagnosis
- This results in increased morbidity, costs and labeling, without improvement in patient-centered outcomes
Thyroid cancer epidemic vs. thyroid cancer detection epidemic?
From: Current Thyroid Cancer Trends in the United States, JAMA 2014

Patients without Thyroid Cancer
- US will identify a large number of thyroid nodules
- Around 50% of the population has a nodule
- The ratio of benign to malignant nodules is 50:1
- It is important to use rational, evidence-based criteria to decide which nodules to biopsy; otherwise a lot of attention is focused on chasing common benign findings for fear of missing a single cancer
Rational and Reasonable Criteria
- There is a very large reservoir of thyroid cancer
- If the goal is to diagnose every single cancer of the thyroid, the only way to do this is to biopsy everyone
- Suggestion: create guidelines for imaging so that patients in whom biopsy is deferred will have a low risk (not zero risk) of thyroid cancer

Levels of Evidence for the Value of Tests
What is the Best Evidence on Which to Base Clinical Interpretations/Decisions?
- An experienced MD's opinion (the expert)
- A case series (i.e. a number of examples)
- A consensus opinion, e.g. society guidelines
- A large, well-done observational study without bias
- Several well-done observational studies and a statistical summary of those results
- A randomized controlled trial

Biases Important In Imaging
- Selection Bias: the patients reported are not typical
- Ascertainment Bias: outcomes (the truth) are often obtained only in patients with suspected abnormalities, and therefore you don't learn about misses
- Over-diagnosis Bias: if you look for disease, you will find a large reservoir of cases that would never have been symptomatic and that otherwise would never have been known. It is easy to cure such cases
Selection Bias
- Patients you read about are not typical of those in whom you are applying the test
- It is easier to diagnose advanced disease: cancer, birth defects, infection, vascular disease
- Being able to diagnose advanced disease does not let you conclude that you can detect early disease

Ascertainment Bias
- What happened to the patients studied?
- Need to follow up on patients with a finding and on those without a finding
- If you only follow up on those in whom you suspect a problem, you will overestimate the accuracy, by a lot!
- Basically, if you don't make an effort to find your misses, you assume they don't occur
- This is incredibly important in the area of thyroid US
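A small worked example may make the size of the ascertainment effect concrete. The counts below are hypothetical and chosen only for illustration; they do not come from any study cited in this talk. The sketch compares the sensitivity measured when every patient is followed up with the sensitivity that appears when only test-positive patients are followed.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of diseased patients that the test flagged."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical cohort: 100 patients who truly have cancer.
flagged_cancers = 60   # the test called these suspicious
missed_cancers = 40    # the test called these normal

# Complete follow-up: every miss is eventually discovered and counted.
true_sensitivity = sensitivity(flagged_cancers, missed_cancers)

# Follow-up only of suspicious cases: the misses are never ascertained,
# so they silently drop out of the denominator.
apparent_sensitivity = sensitivity(flagged_cancers, 0)

print(f"with full follow-up:    {true_sensitivity:.0%}")
print(f"with partial follow-up: {apparent_sensitivity:.0%}")
```

With these made-up counts the test looks perfect under partial follow-up even though it misses 40 of 100 cancers, which is the sense in which unexamined misses are assumed not to occur.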
Guidelines on Thyroid US
- All of the existing studies have ascertainment bias
- The recommendations of all societies:
  - Society of Radiologists in Ultrasound
  - American Thyroid Association
  - American Association of Clinical Endocrinologists
  - European Thyroid Association
  - Associazione Medici Endocrinologi
  - Korean Society of Neuro & Head & Neck Radiology
- are based on expert opinion combined with flawed studies, as there were no well-done observational studies

Over-Diagnosis Bias
- There is a spectrum of disease for every kind of pathology
- If you do a lot of testing for disease (as opposed to waiting until patients are symptomatic) you will find a lot more disease than you think exists
- Prostate cancer is a well-known example of this, but thyroid cancer is just as common
- There is a huge amount of early disease, and you cannot consider finding this disease to be inherently beneficial to patients if the disease was not going to hurt them
Thyroid Ultrasound
- Performed in large numbers of patients
- Thyroid nodules are common
  - Up to 50% of the adult population
- Thyroid cancer is uncommon
  - 1% of all cancer
  - Symptomatic in 1/10,000 patients per year
  - Mostly indolent
  - 5-year survival is 98% even without treatment
  - Not clear how aggressive we should be to find 6 mm

Interpreting Thyroid US
- Large number of studies published on accuracy of US
- All are plagued with selection bias / ascertainment bias
- All studies limited their analysis to FNA'd nodules
  - FNA decision based on size / worrisome features
  - Nodules without worrisome features not studied
- Selection bias from including known and symptomatic cancers
- Both will inflate the accuracy of US
Purpose
- To determine the sonographic features statistically associated with thyroid cancer
- Unique aspect of study: we included nodules subjected to FNAB/surgery AND those not initially biopsied, through follow-up with the tumor registry
- Goal: to identify nodules with low risk of cancer so that FNAB can be deferred/avoided
Study Methods
- Retrospective case-cohort study
- 11,618 patients w/ thyroid US at UCSF between January 2000 and March 2005
- Cohort linked to California Cancer Registry
- Outcome is known with a high degree of certainty for all patients
- Average time from US to surgery: 0.4 yrs (range 0-4.2)
- Mean follow-up period: 3.7 yrs (range 2-6.9)

Selection of Ultrasound Exams to Review
- Cancer Patients: thyroid cancer, no other cancer. Had a pre-operative US at UCSF (N=105; of these, 96 cases retrieved on PACS)
- Control Patients: no thyroid cancer diagnosis for at least 2 years after ultrasound. No other cancer. A sample of controls was matched to Cancer Patients by age, gender and year of US exam (N=369)
Table 1: Histologic Findings of the Study Cancers

Thyroid Ultrasound Review
- Earliest available study chosen, reviewed in PACS
- All sonograms reviewed by 2 experienced board-certified radiologists blinded to outcome
- US features of thyroid gland and individual nodules recorded for each patient
- Nodules included if mean diameter 5 mm or larger
- Findings for up to 4 nodules recorded for each patient
Large Number of Features Assessed
- Size
- Shape
- Margins
- Composition (proportion cystic/solid tissue, appearance)
- Echotexture, echogenicity
- Microcalcifications
- Coarse calcifications
- Comet tail artifact
- Rim calcifications
- Halo / absence of halo
- Central and peripheral Doppler flow signal
- Extracapsular extension or lymph nodes not assessed

Ultrasound-Pathology Correlation
- Some Cancer Patients had benign and malignant nodules
- Rad-Path-Surgical correlation completed for Cancer Patients who had surgery at UCSF
- Nodule considered cancer if US findings (location and size) matched pathology description
- This part of the characterization was not blinded and was completed after the ultrasound interpretation
Accuracy Statistics: which ones are important
- Sensitivity: tells you how well the test performs in patients with the disease you are looking for
- Specificity: tells you how the test performs in normals
- You need statistics that combine these measures
  - PPV/NPV: if the test is normal or abnormal, the risk of disease
  - Likelihood ratio: how well the test discriminates between those affected and those not

Accuracy Statistics
- PPV: highly sensitive to the prevalence of disease
  - In case-control studies, where the number of subjects is chosen by the researcher, this statistic is meaningless
  - In a cohort study, where you know all subjects: useful
- Likelihood ratio is stable across different study designs
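The contrast between PPV and the likelihood ratio can be checked with a few lines of arithmetic. The sketch below uses a hypothetical test (sensitivity 80%, specificity 90%) and two hypothetical prevalences, one mimicking a case-control sample with equal numbers of cases and controls and one mimicking a cohort where disease is rare. None of these numbers are from the study; the function names are illustrative.

```python
def ppv(sens: float, spec: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def positive_lr(sens: float, spec: float) -> float:
    """Positive likelihood ratio; it has no prevalence term at all."""
    return sens / (1 - spec)


SENS, SPEC = 0.80, 0.90          # hypothetical test performance

ppv_case_control = ppv(SENS, SPEC, prevalence=0.50)  # researcher chose 1:1 sampling
ppv_cohort = ppv(SENS, SPEC, prevalence=0.01)        # disease is rare in practice

print(f"PPV in 1:1 case-control sample: {ppv_case_control:.1%}")
print(f"PPV in cohort with 1% disease:  {ppv_cohort:.1%}")
print(f"Positive LR in both settings:   {positive_lr(SENS, SPEC):.1f}")
```

The same test gives a PPV near 89% in the artificial sample and under 8% in the rare-disease cohort, while the likelihood ratio is unchanged because prevalence never enters its formula.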
Likelihood Ratios
- Allow you to look at a finding and figure out how much it increases the risk of disease
- Stable across different prevalences of disease, so it does not matter how many cases or controls are selected
- Don't need to know how common the diagnosis is

Likelihood Ratios
- Combine Sensitivity (Sens) and Specificity (Spec)

Positive LR (PLR):
$$\mathrm{PLR} = \frac{\text{Positive test in those with Dz}}{\text{Positive test in those w/o Dz}} = \frac{\mathrm{Sens}}{1 - \mathrm{Spec}}$$

Negative LR (NLR):
$$\mathrm{NLR} = \frac{\text{Negative test in those with Dz}}{\text{Negative test in those w/o Dz}} = \frac{1 - \mathrm{Sens}}{\mathrm{Spec}}$$

- If the test is positive, the risk increases by x times
- If the test is negative, the risk decreases by y times
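As a quick check of these definitions, the sketch below computes both likelihood ratios from raw counts. The example counts are the microcalcification figures from the univariate results later in this talk (39 of 102 cancer nodules and 28 of 518 benign nodules showed the finding); the function name and its argument names are illustrative.

```python
def likelihood_ratios(pos_with_dz: int, n_with_dz: int,
                      pos_without_dz: int, n_without_dz: int):
    """Return (PLR, NLR) from counts of positive tests in each group."""
    sens = pos_with_dz / n_with_dz               # positive test in those with disease
    spec = 1 - pos_without_dz / n_without_dz     # negative test in those without disease
    plr = sens / (1 - spec)
    nlr = (1 - sens) / spec
    return plr, nlr


# Microcalcifications: 39/102 cancer nodules, 28/518 benign nodules.
plr, nlr = likelihood_ratios(39, 102, 28, 518)
print(f"PLR = {plr:.1f}, NLR = {nlr:.2f}")
```

The positive LR comes out at roughly 7, in line with the value reported for microcalcifications in the univariate table.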
How to Interpret Likelihood Ratios
- Positive LR: the bigger, the better the test is at ruling in disease
  - 1-5: Not very helpful
  - 5-10: Moderately helpful
  - > 10: Extremely helpful (risk increases 10 times)
- Negative LR: the smaller, the better the test is at ruling out disease
  - 1: Absence of finding does not lower risk at all
  - 0.5-1: Not terribly helpful
  - 0.1-0.5: Moderately helpful
  - < 0.1: Extremely helpful (risk decreases 90%)
- Likelihood Ratio of 1 = no association between finding & outcome

Results: Patient analysis
Characteristics of Patients Included in the Study

                      Cancer Patients    Control Patients
                      N=96 (%)           N=369 (%)
Age Distribution
  < 40 years          41 (43%)           164 (44%)
  41-60 years         36 (37%)           132 (36%)
  > 60 years          19 (20%)            76 (20%)
Gender
  Female              71 (74%)           286 (78%)
  Male                25 (26%)            83 (22%)
Thyroid Nodules
Distribution of # of Nodules Among Cases and Controls

# of nodules          Cancer Patients    Control Patients
                      N=96 (%)           N=369 (%)
0 nodule               3 (3%)            161 (44%)
1 nodule              43 (45%)            83 (23%)
2 nodules             21 (22%)            63 (17%)
3 nodules             12 (13%)            29 (8%)
4 or more nodules     17 (17%)            33 (9%)
Total (nodules)       189                428

Thyroid nodules were COMMON: found in 97% of patients diagnosed with cancer and in 56% of controls.

Nodule Level Analysis

                      Cancer Nodule      Benign Nodule
Cancers               102                 87
Controls                0                428
Total                 102                515
Accuracy of Individual US Characteristics (Univariate)

Characteristic                  Sensitivity    False Positives   Likelihood   Odds
                                102 N (%)      518 N (%)         Ratio        Ratio
Microcalcifications             39 (38%)        28 (5%)          7.0          11.6
Echotexture
  Hypoechoic to strap muscle    16 (16%)        34 (6%)          2.4          2.9
  Iso-/Hyperechoic to strap     51 (50%)       198 (38%)         1.3          1.8
Shape
  Taller than wide              18 (18%)        42 (8%)          2.2          2.3
Composition
  Solid                         68 (67%)       220 (43%)         1.6          2.2
  Mixed                         34 (33%)       248 (48%)         0.7          1
  Cystic                         0 (0%)         37 (7%)          0.034        0

Accuracy of Individual US Characteristics (Univariate)

Characteristic                  Sensitivity    False Positives   Likelihood   Odds
                                N (%)          N (%)             Ratio        Ratio
Nodule Size
  < 1 cm                        30 (29%)       248 (48%)         0.6          1
  1-2 cm                        38 (37%)       169 (33%)         1.1          1.9
  > 2 cm                        34 (33%)        97 (19%)         1.8          3.1
Central flow                    40 (39%)       136 (26%)         1.5          1.6
Coarse Calcifications           13 (13%)        34 (7%)          1.9          2.1
Margins
  Ill-defined / lobulated       61 (60%)       212 (41%)         1.5          2.0
Features Predictive of Malignancy, Univariate Analysis
Existing Literature / UCSF Study
- Microcalcifications ++
- Hypoechogenicity ++
- Taller than wide shape ++
- Solid composition ++
- Nodule size, > 2 cm ++
- Central flow +
- Coarse calcifications +
- Ill-defined margins +
- Peripheral vascular flow -
- Halo, absence -
- Comet-tail artifact, absence -
- Rim calcifications -

Features Predictive of Malignancy, Univariate Analysis
- Taller than wide shape
- Microcalcifications
- Hypoechogenicity
- Solid composition
Benign Thyroid Nodules

Multivariate Results: Only Three Variables Remained Significantly Associated with Thyroid Cancer

Variable                Odds Ratio
Microcalcifications     8.1
Nodule size, > 2 cm     3.6
Solid composition       4.0
Thyroid Nodules
- Microcalcifications had the strongest association with cancer: seen in 38% of cancers and 5% of benign nodules

2 cm Papillary Thyroid Cancer
Combining Significant Characteristics
Accuracy of Several Definitions of Abnormal US Result

Definition                               Sensitivity   False      Likelihood   Risk of   # Needed
                                                       Positive   Ratio        cancer    to Biopsy
1/3 Findings                             88%           44%        2.0          2%        56
Solid                                    77%           32%        2.4          2%        48
Size > 2 cm                              39%           21%        1.9          2%        59
Microcalcifications                      39%            4%        9.7          8%        12
2/3 Findings                             52%            7%        7.1          6%        16
3/3 Findings                              7%            0%        28           100%       1
Microcalcifications or Solid AND > 2 cm  54%            8%        6.7          6%        17

Risk of Thyroid Cancer Based on Appearance of Thyroid and Characteristic of Any Nodules Identified

Risk category   Finding                                   Number of Cancers per 1000 patients
Lowest Risk     Homogeneous gland                         0.6
Very Low        0/3 features                              2
Low             No nodule with 2+ features                5
Moderate        Thyroid has nodule with 1/3 features      18
                Nodule with 2+ features                   62
                Microcalcifications                       82
                Microcalcifications or Solid and > 2 cm   58
Very High       Nodule with 3/3 features                  960
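The link between the likelihood ratio, the risk of cancer and the number needed to biopsy in the first table can be sketched as follows. The slides do not state how the risk column was derived; this sketch assumes the pre-test risk is the cohort proportion from the Study Methods slide (105 cancer patients among 11,618 patients) and converts it through odds. That assumption, and the function names, are mine rather than the authors'.

```python
def post_test_risk(pre_test_risk: float, lr: float) -> float:
    """Apply a likelihood ratio to a pre-test risk via odds."""
    pre_odds = pre_test_risk / (1 - pre_test_risk)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


def number_needed_to_biopsy(risk: float) -> float:
    """Biopsies performed per cancer found, if everyone at this risk is biopsied."""
    return 1 / risk


# Assumed pre-test risk: cancer patients / all patients in the cohort.
PRE_TEST_RISK = 105 / 11618

for label, lr in [("1/3 findings", 2.0),
                  ("Microcalcifications", 9.7),
                  ("2/3 findings", 7.1)]:
    risk = post_test_risk(PRE_TEST_RISK, lr)
    nnb = number_needed_to_biopsy(risk)
    print(f"{label}: risk {risk:.1%}, about {nnb:.0f} biopsies per cancer")
```

Under this assumption the three rows come out close to the tabulated values (2%, 8% and 6% risk; 56, 12 and 16 biopsies per cancer), which suggests the table is internally consistent with a pre-test risk a little under 1%.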
Sample Comparison Guidelines
- Most existing society guidelines are complex and require discretion on the part of the performing MD
- Example: Society of Radiologists in Ultrasound guidelines (strongly consider FNA for):
  - Nodules > 10 mm if microcalcifications
  - Nodules > 15 mm if solid or predominantly solid, or if coarse calcifications
  - Nodules > 20 mm if mixed solid/cystic or cystic with mural nodule
- NO mention of nodules < 1 cm

Frates et al. Management of thyroid nodules detected at US: Society of Radiologists in Ultrasound Consensus Conference statement. Radiology 2005, 237:794
ATA* (FNA if...) vs. SRU (Strongly consider FNA if...), by Size of Nodule (mm)

5-10 mm
  ATA: Clinical risk factors & suspicious US features
  SRU: No recommendations
10-15 mm
  ATA: Nodule w/ microcalcifications or solid
  SRU: Nodule w/ microcalcifications
15-20 mm
  ATA: Nodule w/ microcalcifications, solid, or both solid/cystic w/ suspicious features
  SRU: Nodule w/ microcalcifications or solid w/ coarse calcifications
> 20 mm
  ATA: All nodules except purely cystic ones
  SRU: Nodule w/ microcalcifications, solid, coarse calcifications, solid/cystic or cystic + mural nodule

*Cooper et al. Revised ATA Management Guidelines for Patients with Thyroid Nodules. Thyroid 2009, 19:11
ATA suggests testing TSH and biopsy of those w/ normal or elevated levels

RSB: Conclusions
- If 1 feature is used as the indication for biopsy: most cases of cancer detected (sens 88%), with a high false + rate (44%) and low + LR (2.0). 56 biopsies/ca
- If 2 features are used as the indication for biopsy: fewer cases detected (sens 52%), with a lower false + rate (7%) and higher + LR (7.1). Only 16 biopsies/ca
- Compared with biopsy of all nodules > 5 mm, this approach (2 or more features to prompt biopsy) would reduce unnecessary biopsies by 90% while maintaining a low risk of cancer (5/1000 patients for whom biopsy is deferred)
RSB: Recommendations and Practice
- Biopsy nodules with microcalcifications
- Biopsy nodules if larger than 2 cm and entirely solid
- OR just biopsy when two features are present
- Nodules without these findings need not be biopsied or followed. There is no evidence that surveillance has any value
- Patients in whom biopsy is deferred have a risk of cancer < 0.5%

Summary
- Thyroid nodules are extremely common, > 50% of controls
- Fewer than 2% are cancer, 98% benign
- Unnecessary tissue sampling is invasive and costly, and leads to repeated sampling and open surgical procedures due to inadequate sampling and non-diagnostic pathology
- Only 3 features are statistically associated with cancer:
  - Microcalcifications
  - Size > 2 cm
  - Solid
- These US features can be used to decide which nodules to biopsy
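The two alternative biopsy rules on the Recommendations slide can be written down as simple predicates, which makes it easier to see where they agree and differ. This is only a sketch of the rules as worded on the slide; the `Nodule` class, its field names and the function names are hypothetical, and the slide does not specify how size is to be measured.

```python
from dataclasses import dataclass


@dataclass
class Nodule:
    """Hypothetical record of the three features retained in the multivariate model."""
    microcalcifications: bool
    solid: bool
    size_cm: float


def feature_count(n: Nodule) -> int:
    """Number of the three features present (0 to 3)."""
    return sum([n.microcalcifications, n.solid, n.size_cm > 2.0])


def biopsy_by_feature_rule(n: Nodule) -> bool:
    """Rule 1: microcalcifications, or solid AND larger than 2 cm."""
    return n.microcalcifications or (n.solid and n.size_cm > 2.0)


def biopsy_by_count_rule(n: Nodule) -> bool:
    """Rule 2: biopsy when two or more features are present."""
    return feature_count(n) >= 2


small_with_microcalc = Nodule(microcalcifications=True, solid=False, size_cm=0.8)
large_solid = Nodule(microcalcifications=False, solid=True, size_cm=2.5)
small_solid = Nodule(microcalcifications=False, solid=True, size_cm=1.0)

for n in (small_with_microcalc, large_solid, small_solid):
    print(n, biopsy_by_feature_rule(n), biopsy_by_count_rule(n))
```

The rules differ only for nodules whose single feature is microcalcifications: rule 1 biopsies them and rule 2 does not, which matches the small gap between the "Microcalcifications or Solid AND > 2 cm" and "2/3 Findings" rows in the combined-characteristics table.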
Thank you
Vickie.Feldstein@ucsf.edu
Rebecca.Smith-Bindman@ucsf.edu