Hierarchical Clustering of Human Papilloma Virus Genotype Patterns in the ASCUS-LSIL Triage Study


Published OnlineFirst on October 19, 2010 as 10.1158/0008-5472.CAN-10-1188

Cancer Research
Prevention and Epidemiology

Nicolas Wentzensen 1, Lauren E. Wilson 1, Cosette M. Wheeler 2, Joseph D. Carreon 1, Patti E. Gravitt 3, Mark Schiffman 1, and Philip E. Castle 1

Authors' Affiliations: 1 Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland; 2 Department of Pathology, The University of New Mexico, Albuquerque, New Mexico; and 3 Departments of Epidemiology, and Molecular Microbiology and Immunology, Johns Hopkins University, Baltimore, Maryland

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

Corresponding Author: Nicolas Wentzensen, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Room 5024, 6120 Executive Boulevard, Rockville, MD 20852-7234. Phone: 301-435-3975; Fax: 301-402-0916; E-mail: wentzenn@mail.nih.gov.

doi: 10.1158/0008-5472.CAN-10-1188
©2010 American Association for Cancer Research.

Abstract

Anogenital cancers are associated with 13 carcinogenic human papilloma virus (HPV) types in a broader group that cause cervical intraepithelial neoplasia (CIN). Multiple concurrent cervical HPV infections are common, which complicates the attribution of HPV types to different grades of CIN. Here we report the analysis of HPV genotype patterns in the atypical squamous cells of undetermined significance-low-grade squamous intraepithelial lesion triage study with the use of unsupervised hierarchical clustering. Women who underwent colposcopy at baseline (n = 2,780) were grouped into 20 disease categories based on histology and cytology. Disease groups and HPV genotypes were clustered with the use of complete linkage. Risk of 2-year cumulative CIN3+, viral load, colposcopic impression, and age were compared between disease groups and major clusters. Hierarchical clustering yielded four major disease clusters: cluster 1 included all CIN3 histology with abnormal cytology; cluster 2 included CIN3 histology with normal cytology and combinations with either CIN2 or high-grade squamous intraepithelial lesion cytology; cluster 3 included older women with normal or low-grade histology/cytology and low viral load; and cluster 4 included younger women with low-grade histology/cytology, multiple infections, and the highest viral load. Three major groups of HPV genotypes were identified: group 1 included only HPV16; group 2 included nine carcinogenic types, plus noncarcinogenic HPV53 and HPV66; and group 3 included noncarcinogenic types, plus carcinogenic HPV33 and HPV45. Clustering results suggested that colposcopy missed a prevalent precancer in many women with no biopsy/normal histology and high-grade squamous intraepithelial lesion. This result was confirmed by an elevated 2-year risk of CIN3+ in these groups. Our novel approach to study multiple genotype infections in cervical disease with the use of unsupervised hierarchical clustering can address complex genotype distributions on a population level. Cancer Res; 70(21); 8578-86. ©2010 AACR.

Introduction

More than 40 different types of human papilloma viruses (HPV) can infect the anogenital mucosa. Most of these types cause asymptomatic transient infections that may be associated with minor cytologic alterations, whereas approximately a dozen carcinogenic types can cause anogenital cancer (1). HPV16 is by far the most carcinogenic type. HPV18, 31, 33, and 45 follow, and together with HPV16 account for >90% of HPV-related cancers (2).

Traditionally, cervical cancer was thought to arise through increasingly severe grades of cervical intraepithelial neoplasia (CIN), defined by the extent and severity of cellular atypia.
However, CIN1 is now known to represent acute HPV infection, whereas CIN3 is precancer (and includes carcinoma in situ). At the transition between acute infection and precancer is an equivocal and poorly reproducible diagnosis called CIN2 (3), which probably represents a mixture of precancer and HPV infection. The cytologic correlates used in screening are high-grade squamous intraepithelial lesion (HSIL) corresponding to CIN2/3, or low-grade squamous intraepithelial lesion (LSIL) corresponding to CIN1. The most common equivocal cytologic abnormalities are called atypical squamous cells of undetermined significance (ASCUS).

The multiple stages of cervical carcinogenesis are being redefined due to increased understanding of HPV natural history (1, 4). We distinguish acute HPV infection, a common and benign condition, from uncommon persistent infection that is the true risk factor for precancer and cancer.

The prevalence of HPV types in the genital tract and the association of HPV types with different stages in CIN are related to multiple host and viral factors. HPV infection is easily transmitted by sexual contact, whereas poorly understood immunologic factors are related to viral clearance/persistence. The risk of viral persistence and associated development of precancer vary by type. Some noncarcinogenic types from the α3/α15 species are preferentially detected in the vaginal epithelium, whereas α7 types, such as HPV18 and HPV45, are frequently found in endocervical lesions (5). There is no convincing evidence for interaction between multiple cervical HPV infections.

Figure 1. Consort diagram. Flow chart of individuals included in the analysis. Women were referred to ASCUS-LSIL triage study from four clinical centers with a cytology diagnosis of ASCUS (3,488 women enrolled) or LSIL (1,572 women enrolled). Women were randomized into three trial arms: immediate colposcopy, HPV triage, and conservative management. Our analysis included all women who received a colposcopy at the enrollment visit. In the immediate colposcopy group, all women received colposcopy. Women in the HPV triage arm underwent colposcopy if they were HPV-positive or received a cytology diagnosis of HSIL at their enrollment visit, and women in the conservative management arm underwent colposcopy only if they received an HSIL cytology diagnosis at enrollment. Our final analysis included 2,780 women.

We wish to sort the different histologic and cytologic findings by their relationships with HPV natural history. However, HPV prevalence studies with the use of broad HPV genotyping have shown that multiple HPV infections are very common, especially in young women at the peak of their sexual activity (6, 7). Current genotyping assays allow detection of up to 40 HPV genotypes from the same sample, generating complex HPV genotyping data that complicate the attribution of individual HPV genotypes to grades of CIN and corresponding cytology.
Thus far, the complexity has been mainly addressed by restricting analyses to single genotype infections, by attributing genotypes to disease hierarchically based on the HPV prevalence in cervical cancers, or by combining genotypes within phylogenetic species. We previously showed the wide range of potential type attribution to all stages of cervical disease that can only be resolved by genotyping of individual lesions on the cervix (7).

The attribution of individual genotypes to cervical disease is further complicated by the imprecise ascertainment of cervical disease stages. Colposcopy and biopsy frequently miss prevalent precancer (8, 9), and even if the worst lesion is not missed, cytology and histology both have limited reproducibility, especially at the lower disease stages (10). For example, in a cross-sectional study, we showed that women with HSIL cytology and normal biopsy results had HPV genotype patterns very similar to those of women with biopsy-confirmed high-grade disease, suggesting that the worst lesion was frequently missed in colposcopy (11).

To further our understanding of the spectrum of histologic and cytologic abnormalities in relationship to HPV types, we analyzed HPV genotype distributions in 20 disease categories based on cytology and histology, and show their relation to subsequent 2-year risk of CIN3 in a large clinical trial called the ASCUS-LSIL triage study.

Materials and Methods

Study population

The ASCUS-LSIL triage study was a multicenter, randomized clinical trial conducted by the National Cancer Institute to compare three clinical management strategies for women referred with a community cytologic interpretation of ASCUS (n = 3,488) or LSIL (n = 1,572) (12). At enrollment, cytology was repeated and HPV testing was done with the use of Hybrid Capture 2 (Digene Corporation, now Qiagen). Women in the immediate colposcopy arm were referred to colposcopy regardless of test results.
In the HPV triage arm, women were referred to colposcopy if they had a positive or missing Hybrid Capture 2 result at enrollment, or if their enrollment cytology was HSIL. Women in the conservative management arm received colposcopy only in the case of an HSIL cytology result at enrollment. Women were followed for 2 years, with cytology follow-ups every 6 months. Women with an HSIL cytology result at any of these follow-up visits were referred to colposcopy.

Our analysis included all women referred for either ASCUS (see footnote 4) or LSIL who underwent colposcopy at enrollment and who had a cytology diagnosis in the enrollment period (n = 2,780; Fig. 1). Colposcopy was done by nurse practitioner colposcopists, general gynecologists, gynecology oncology fellows, or gynecologic oncologists. The type of medical training did not influence the sensitivity of colposcopy to detect CIN3+ in 2 years of follow-up (8). Before colposcopy, high-resolution photography of the cervix was done to evaluate visual screening and as additional colposcopy quality control.

Footnote 4: ASCUS under the 1991 Bethesda system was slightly more inclusive, particularly of probable reactive changes and atypical squamous cells with possible HSIL, than the ASCUS category of the 2001 Bethesda system.

Figure 2. Hierarchical clustering of disease groups by genotype prevalence. Columns, disease combinations based on histology (no biopsy, normal, CIN1, CIN2, CIN3) and cytology (normal, ASCUS, LSIL, HSIL); N, number of women in each disease combination; rows, HPV genotypes. The heatmap shows type-specific prevalence determined by line blot assay in gray scale for each type-disease stage combination. Darker gray, higher prevalence. Dendrograms are shown for both disease groups and HPV genotypes. 1 to 4, four disease clusters; 1 to 3, three genotype clusters; 2a to c, and 3a and b, subgroups.

At each study visit, a pelvic exam was done, and two cervical specimens were collected. One specimen was preserved in PreservCyt (Cytyc, now Hologic) for cytology and Hybrid Capture 2 testing, and the second was preserved in specimen transport medium (Qiagen). The National Cancer Institute and local Institutional Review Boards approved the study, and all participants provided written informed consent.

HPV genotyping

Line blot assay was done on enrollment specimen transport medium specimens as previously described for the detection of 27 individual HPV types (HPV6, 11, 16, 18, 26, 31, 33, 35, 39, 40, 42, 45, 51-59, 66, 68, 73, and 82-84) (13). A subset of specimens was retested by linear array, a commercialized version of line blot assay that tests for 37 HPV genotypes, including 26 detected by line blot assay as previously described (14). Residual PreservCyt (Hologic) specimens, after being used for liquid-based cytology, were tested by Hybrid Capture 2 (Qiagen), a pooled-probe, signal amplification DNA test that targets a group of 13 carcinogenic HPV types.

Pathology and treatment

Treatment was based on cytologic and histologic diagnoses made by the clinical center pathologists as described previously. For quality control purposes, the Pathology Quality Control Group at Johns Hopkins Hospital reviewed referral smears, ThinPreps, and histology slides, and provided secondary diagnoses. Excisional treatment by the loop electrosurgical excision procedure was offered to any woman receiving a clinical center histology diagnosis of CIN2 or worse, or a quality control diagnosis of CIN3 or worse. At the time of study exit, all women with persistent mild cervical abnormalities were offered treatment by loop electrosurgical excision procedure.

Statistical methods

Women were separated into 20 categories of cervical disease, formed by crossing the enrollment histology diagnosis with the cytology result.
Disease combinations were consistently labeled with the use of the following format: 'histology result'/'cytology result.' Histology diagnoses were drawn from the quality control histology diagnosis when available. If a subject did not have a quality control histology diagnosis for the enrollment period, which was rare, her clinical center enrollment histology diagnosis was used. Histology diagnoses were classified as Normal, CIN1, CIN2, CIN3, or 'No Biopsy' (because no biopsy was taken at colposcopy, indicating a negative colposcopic impression). Enrollment cytology interpretation categories were Normal, ASCUS (including atypical squamous cells with possible HSIL), LSIL, or HSIL.

With the use of the HPV infection status of each individual from the line blot assay results, we calculated type-specific HPV infection frequencies at enrollment for each diagnosis category. Also for each category, we calculated 2-year risk of CIN3+ diagnosis and described some other clinical and demographic data.

To examine patterns of HPV infection in these 20 diagnosis categories, we used hierarchical clustering to compare HPV genotype patterns in each disease group. We used complete linkage and a Euclidean distance metric. We simultaneously clustered both disease combinations and HPV genotypes, and created dendrograms to visualize the clustering with the use of the TreeView software.

For sensitivity analyses, we also did the same hierarchical clustering: (a) with the use of the linear array results instead of the line blot assay results (linear array results were only available in women referred into the ASCUS-LSIL triage study with an ASCUS Pap), and (b) restricted to single-type HPV infections. We also recategorized the women based on worst 2-year histology and enrollment cytology, and did the same analyses to examine the effect of misclassification of disease at the time of enrollment on the HPV frequency patterns.
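The prevalence and clustering steps described above can be illustrated with a minimal sketch. It assumes a tiny set of hypothetical per-woman records (invented for illustration, not study data), computes type-specific prevalence per disease category, and then merges categories by complete linkage on Euclidean distance. The function names `prevalence_by_category` and `complete_linkage` are hypothetical; the authors report using Cluster 3.0 and Java TreeView, and in practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="complete"` would normally replace the hand-written loop.

```python
import math
from collections import defaultdict
from itertools import combinations


def prevalence_by_category(records, genotypes):
    """Type-specific prevalence (%) per disease category.

    records: iterable of (category, set_of_detected_genotypes), one per woman.
    """
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for category, detected in records:
        totals[category] += 1
        for g in detected:
            counts[category][g] += 1
    return {
        cat: [100.0 * counts[cat][g] / totals[cat] for g in genotypes]
        for cat in totals
    }


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linkage_distance(profiles, c1, c2):
    """Complete linkage: the largest pairwise distance between members."""
    return max(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)


def complete_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in sorted(profiles)]
    while len(clusters) > n_clusters:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: linkage_distance(profiles, *pair),
        )
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(sorted(c1 + c2))
    return sorted(clusters)


# Hypothetical toy data: (histology/cytology category, genotypes detected).
genotypes = ["HPV16", "HPV31", "HPV51"]
records = [
    ("CIN3/HSIL", {"HPV16"}),
    ("CIN3/HSIL", {"HPV16", "HPV31"}),
    ("CIN3/LSIL", {"HPV16"}),
    ("CIN3/LSIL", {"HPV16", "HPV51"}),
    ("normal/normal", set()),
    ("normal/normal", {"HPV51"}),
    ("normal/ASCUS", set()),
    ("normal/ASCUS", {"HPV31"}),
]

profiles = prevalence_by_category(records, genotypes)
clusters = complete_linkage(profiles, n_clusters=2)
print(clusters)
```

In this toy example the two CIN3 categories share a high HPV16 prevalence and therefore merge before either joins a normal-histology category. Clustering the genotypes instead of the disease groups amounts to running the same procedure on the transposed prevalence matrix.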
We confirmed the clustering results by doing a k-means cluster analysis specifying three and four clusters. Data analysis was done with the use of SAS version 9.1. Cluster 3.0 was used for clustering of HPV frequency arrays, and Java TreeView was used to construct frequency maps and dendrograms (15, 16).

Results

HPV genotypes in disease groups

Supplementary Table S1 displays the HPV genotype prevalence in all histology/cytology combinations. In 18 of the 20 groups, the most frequent genotype was HPV16, the exceptions being the groups normal/LSIL and CIN1/normal. Within each histology category, the women with HSIL cytology had the highest frequency of HPV16 infection; HPV16 frequency also increased with increasing histologic severity. Of 165 women with CIN3/HSIL, 117 (71%) were infected with HPV16. Multiple infections were very common in this population but varied significantly between disease groups from an average number of 0.74 types in women with normal/normal to an average number of 2.38 types in women with CIN2/LSIL. Across all histologic categories, most infections were found in women with LSIL cytology (Supplementary Table S1).

Clustering of disease groups by genotype frequencies

Unsupervised hierarchical clustering of disease groups by genotype frequencies yielded a tree with four major histology/cytology clusters (Fig. 2; Table 1). Cluster 1 included CIN3/ASCUS, CIN3/LSIL, and CIN3/HSIL. Cluster 2 included CIN3/normal, all less-severe histology with HSIL except for CIN1/HSIL, and CIN2/normal. Cluster 3 included normal histology or no biopsy with normal, ASCUS, or LSIL cytology. Cluster 4 included CIN2/ASCUS, CIN2/LSIL, normal/LSIL, and all CIN1 except for CIN1/normal.

We observed very similar clustering of disease combinations when restricting the analysis to women referred for ASCUS cytology only with the use of either line blot assay or linear array HPV genotyping data, although some histology/cytology combinations had <20 cases and were excluded.
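The k-means confirmation named in the statistical methods can be sketched in the same spirit. This is a minimal illustration of Lloyd's algorithm, assuming hypothetical prevalence profiles and deterministic starting centers; the source does not describe how the k-means analysis was initialized or configured, so the function `kmeans`, the data, and the choice of starting centers are all assumptions.

```python
import math


def kmeans(points, initial_centers, max_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and center update."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the previous center if a cluster empties
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical prevalence profiles (%) for three genotypes per disease group.
group_profiles = {
    "CIN3/HSIL": [70.0, 20.0, 5.0],
    "CIN3/LSIL": [60.0, 25.0, 10.0],
    "normal/normal": [8.0, 4.0, 3.0],
    "normal/ASCUS": [12.0, 6.0, 5.0],
}
names = list(group_profiles)
points = [group_profiles[n] for n in names]

# Assumption: seed one center from a CIN3 group and one from a normal group.
labels, centers = kmeans(points, initial_centers=[points[0], points[2]])
for name, label in zip(names, labels):
    print(name, label)
```

Agreement between the k-means labels and the groups cut from the dendrogram is the kind of check the text describes; with real data, several random initializations would usually be compared because k-means can converge to different local optima.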
When restricting to cases with single-type HPV infections only, many combinations had only very small numbers and generated unstable clusters. K-means clustering specifying four disease clusters reproduced the same grouping pattern that was observed with unsupervised hierarchical clustering.

Clustering of genotypes by genotype frequencies in disease groups

HPV genotypes were clustered in three major groups (Fig. 2; Table 2): HPV16 clustered separately from all other HPV genotypes, driven by its high frequency across all disease stages (HPV cluster 1). Nine carcinogenic types, plus HPV66 and HPV53, were included in HPV cluster 2; the first subgroup included only α9 types, the second had mainly α7 types, and the third was dominated by α6 and α9 types. All remaining noncarcinogenic types, as well as HPV33 and HPV45, were included in HPV cluster 3; there was no specific distribution of HPV clades in the three subgroups of cluster 3.

HPV genotypes in the first and second HPV clusters showed a differential distribution across disease categories, with their lowest prevalence in the disease cluster 3 and the highest prevalence in the disease cluster 4. Within the third HPV group, we observed 2 subgroups: (a) a subset of HPV genotypes with very low prevalence and no disease-specific distribution (HPV11, HPV26, HPV40, and HPV57 in HPV cluster 3a), and (b) 11 types with higher prevalence evenly distributed across all disease clusters (HPV clusters 3b+c). Exclusion of noncarcinogenic types had only a minor effect on disease clustering, whereas exclusion of carcinogenic types produced a completely different clustering of disease combinations, suggesting that the clustering was driven by carcinogenic types. Overall, the HPV clusters were less distinct than the disease clusters, as indicated by the distance metrics and visualized by the flat branching in the HPV genotype dendrogram.

Clinical characteristics and risk of subsequent CIN3 in disease clusters

We studied 2-year risk of CIN3 within the individual disease groups and within disease clusters (Fig. 3; Table 1). The 2-year risk of CIN3 differed significantly between the clusters (P < 0.001). Cluster 1 included only women with CIN3. Women in cluster 2 (including some women with CIN3 histology) had a 20% risk of CIN3 on average. The risk was much lower for women in cluster 4 (6.5%) and lowest for women in cluster 3 (3.5%). The clustering of women with normal or no histology and HSIL cytology in clusters with high-grade histology and cytology is also reflected by their higher risk of developing CIN3: among women with normal histology at enrollment, 16.5% of those with HSIL enrollment cytology (cluster 2) had the worst 2-year histology result of CIN3+ compared with 3.3% of women who had normal enrollment cytology (cluster 3; P < 0.001).

We also studied the clinical and demographic characteristics across the four disease clusters (Fig. 3). Women in cluster 3 were oldest with a median age of 27 years, whereas women in cluster 4 were youngest with a median age of 23 years, similar to clusters 1 and 2 (median age of 24 y; P < 0.001). Semiquantitative viral load determined by Hybrid Capture 2 signal strength was highest in cluster 4 and lowest in cluster 3 (P < 0.001).

Table 1. Two-year risk of high-grade disease and clinical characteristics of disease groups

| Histology_cytology group | N | Dendrogram cluster | 2-y risk CIN3+ (%) | 2-y risk CIN2+ (%) | 2-y risk confirmed CIN3 (%) | >25 y at enrollment (%) |
|---|---|---|---|---|---|---|
| Normal_LSIL | 352 | 4 | 6 | 10.2 | 3.7 | 33 |
| CIN1_LSIL | 280 | 4 | 5.7 | 12.1 | 3.2 | 31.1 |
| CIN1_ASCUS | 95 | 4 | 6.3 | 8.4 | 3.2 | 41.1 |
| CIN1_HSIL | 62 | 4 | 9.7 | 22.6 | 6.5 | 21 |
| CIN2_LSIL | 87 | 4 | 9.2 | 100 | 5.6 | 28.7 |
| CIN2_ASCUS | 52 | 4 | 1.9 | 100 | 0 | 21.2 |
| Normal_normal | 670 | 3 | 2.8 | 4.7 | 1.3 | 53.5 |
| Normal_ASCUS | 376 | 3 | 2.9 | 6.4 | 1.1 | 45.5 |
| No biopsy_normal | 34 | 3 | 8.8 | 14.7 | 2.9 | 29.4 |
| CIN1_normal | 73 | 3 | 1.4 | 4.1 | 0 | 49.3 |
| No biopsy_LSIL | 63 | 3 | 12.7 | 15.9 | 6.4 | 44.4 |
| No biopsy_ASCUS | 31 | 3 | 3.2 | 9.7 | 3.2 | 61.3 |
| CIN2_normal | 20 | 2 | 10 | 100 | 5 | 35 |
| Normal_HSIL | 109 | 2 | 11.9 | 19.2 | 8.3 | 43.1 |
| CIN2_HSIL | 81 | 2 | 9.9 | 100 | 8.6 | 33.3 |
| No biopsy_HSIL | 20 | 2 | 10 | 15 | 10 | 35 |
| CIN3_normal | 25 | 2 | 100 | 100 | 56 | 52 |
| CIN3_ASCUS | 29 | 1 | 100 | 100 | 44.8 | 20.7 |
| CIN3_LSIL | 53 | 1 | 100 | 100 | 45.3 | 35.9 |
| CIN3_HSIL | 165 | 1 | 100 | 100 | 78.2 | 37 |

Table 1. Two-year risk of high-grade disease and clinical characteristics of disease groups (Cont'd)

| Histology_cytology group | Median age | Current smoker (%) | High-grade colposcopy (%) | HSIL, CC (%) | High-grade cervigram (%) | RLU median | HC2+ (%) |
|---|---|---|---|---|---|---|---|
| Normal_LSIL | 23 | 33.5 | 6.8 | 6.8 | 1.5 | 346.6 | 93.2 |
| CIN1_LSIL | 23 | 41.8 | 8.9 | 10.4 | 2.9 | 808.4 | 98.5 |
| CIN1_ASCUS | 24 | 35.8 | 10.5 | 4.3 | 3.2 | 151.1 | 89 |
| CIN1_HSIL | 22 | 38.7 | 24.2 | 56.5 | 5.2 | 466.8 | 98.3 |
| CIN2_LSIL | 23 | 40.7 | 25.3 | 20.7 | 6 | 653 | 100 |
| CIN2_ASCUS | 23 | 42.3 | 21.2 | 13.5 | 1.9 | 161.8 | 98 |
| Normal_normal | 27 | 28.8 | 5.1 | 0.6 | 0.5 | 0.5 | 42.3 |
| Normal_ASCUS | 24 | 32.4 | 5.6 | 1.9 | 1.6 | 12.7 | 69.7 |
| No biopsy_normal | 27 | 26.5 | 0 | 0 | 0 | 0.4 | 14.7 |
| CIN1_normal | 25 | 30.6 | 9.6 | 4.1 | 0 | 14.3 | 85.3 |
| No biopsy_LSIL | 23 | 44.4 | 0 | 11.1 | 1.6 | 588.5 | 72.6 |
| No biopsy_ASCUS | 26 | 38.7 | 0 | 0 | 0 | 0.9 | 71 |
| CIN2_normal | 23.5 | 30 | 15 | 5 | 0 | 4.1 | 70 |
| Normal_HSIL | 24 | 49.5 | 15.6 | 54.1 | 2.8 | 112.5 | 92.2 |
| CIN2_HSIL | 23 | 45.7 | 38.3 | 76.5 | 6.3 | 226.2 | 100 |
| No biopsy_HSIL | 24.5 | 45 | 0 | 50 | 5.3 | 106.1 | 85 |
| CIN3_normal | 24 | 56 | 36 | 24 | 4 | 23.7 | 75 |
| CIN3_ASCUS | 23 | 37.9 | 27.6 | 17.2 | 10.3 | 80.3 | 93.1 |
| CIN3_LSIL | 23 | 54.7 | 35.9 | 32.1 | 11.5 | 472.7 | 100 |
| CIN3_HSIL | 24 | 55.8 | 54.6 | 84.2 | 22.5 | 240.5 | 100 |

NOTE: Disease groups are ordered according to the dendrogram from the heatmap in Fig. 2. Disease groups are based on histology (no biopsy, normal, CIN1, CIN2, CIN3) and cytology (normal, ASCUS, LSIL, HSIL). The columns present 2-year risk of CIN3, 2-year risk of CIN2, 2-year risk of confirmed CIN3 (CIN3 with a consensus clinical center and quality control diagnosis), percent of women >25 years at enrollment, median age, percent of women smoking at enrollment, percent of women with high-grade colposcopy at enrollment, percent of women with a clinical center HSIL cytology result, percent of women with a positive cervigram, median relative light units (RLU)/positive control value in HC2, and percent of women with a positive HC2 result. All percentages and median values are given for the individual histology/cytology combinations and for the four disease clusters.
Abbreviations: CC, clinical center; HC2, Hybrid Capture 2.
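As a rough cross-check of the cluster-level risks quoted in the text, the group sizes and group-level 2-year CIN3+ risks from Table 1 can be pooled within each dendrogram cluster. The sketch below assumes that a cluster's risk is the N-weighted mean of its groups' risks; the source does not state how the cluster figures were computed, and the function name `pooled_risk` is hypothetical.

```python
# (group, N, dendrogram cluster, 2-y risk of CIN3+ in %), transcribed from Table 1.
TABLE1 = [
    ("Normal_LSIL", 352, 4, 6.0),
    ("CIN1_LSIL", 280, 4, 5.7),
    ("CIN1_ASCUS", 95, 4, 6.3),
    ("CIN1_HSIL", 62, 4, 9.7),
    ("CIN2_LSIL", 87, 4, 9.2),
    ("CIN2_ASCUS", 52, 4, 1.9),
    ("Normal_normal", 670, 3, 2.8),
    ("Normal_ASCUS", 376, 3, 2.9),
    ("No biopsy_normal", 34, 3, 8.8),
    ("CIN1_normal", 73, 3, 1.4),
    ("No biopsy_LSIL", 63, 3, 12.7),
    ("No biopsy_ASCUS", 31, 3, 3.2),
    ("CIN2_normal", 20, 2, 10.0),
    ("Normal_HSIL", 109, 2, 11.9),
    ("CIN2_HSIL", 81, 2, 9.9),
    ("No biopsy_HSIL", 20, 2, 10.0),
    ("CIN3_normal", 25, 2, 100.0),
    ("CIN3_ASCUS", 29, 1, 100.0),
    ("CIN3_LSIL", 53, 1, 100.0),
    ("CIN3_HSIL", 165, 1, 100.0),
]


def pooled_risk(rows, cluster):
    """N-weighted mean of group-level risk (%) within one dendrogram cluster."""
    members = [(n, risk) for _, n, c, risk in rows if c == cluster]
    total_n = sum(n for n, _ in members)
    return sum(n * risk for n, risk in members) / total_n


for cluster in (1, 2, 3, 4):
    print(f"cluster {cluster}: {pooled_risk(TABLE1, cluster):.1f}%")
```

Under this assumption the pooled values come to about 19.6%, 3.4%, and 6.3% for clusters 2, 3, and 4, close to but not identical with the 20%, 3.5%, and 6.5% quoted in the text; rounding of the tabulated group risks is one plausible source of the difference.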
The frequency of high-grade colposcopy was highest in cluster 1, followed by clusters 2 and 4, and lowest in cluster 3 (P < 0.001), a pattern confirmed for the review of cervigram images (P < 0.001). The frequency of current smoking increased with increasing severity of disease in a cluster (1 > 2 > 4 > 3), further suggesting that smoking is indeed a cofactor for the development of precancerous lesions.

Discussion

In this analysis, we used a novel approach to study complex HPV genotype patterns in cervical disease and to address the misclassification of cervical disease stages defined by histology and cytology. Both histology and cytology are subjective methods with limited reproducibility. Colposcopic biopsy frequently misses the worst lesion on the cervix and augments the problem of disease misclassification (8, 9). Here, we did unsupervised hierarchical clustering of HPV genotyping data to agnostically define disease groups with similar HPV genotype patterns.

We identified four disease clusters based on their unique HPV genotype distributions. The first cluster included women with CIN3 and HSIL, LSIL, or ASCUS cytology. The second cluster included mainly women with HSIL and/or CIN2. The third cluster included only women without high-grade histology or cytology and with low levels of viral infection, whereas the fourth cluster included women with mild-to-moderate dysplasia and cytologic signs of active viral infections (ASCUS and LSIL). We confirmed a previous finding that women with HSIL cytology but without histologically confirmed high-grade biopsy results clustered with those with histologically confirmed high-grade disease (11). We now show that these women also have a high risk of being detected with CIN3+ in the subsequent 2 years, corroborating that colposcopy-biopsy missed some prevalent precancer.
Women grouped in the four clusters had distinct clinical characteristics: women in the first cluster all had prevalent CIN3, and accordingly, they were the most likely to have visual and microscopic evidence of abnormalities and higher viral load. Women in the second cluster had the second highest risk of CIN3, had fewer abnormal cervical impressions, and lower viral load. Women in cluster 3 were slightly older, had the lowest number of HPV infections across all types, few abnormal cervical impressions, very low viral load, and the lowest risk of CIN3. In contrast, women in the fourth cluster were slightly younger, had the highest number of infections, the highest viral load, and an intermediate risk of CIN3, lower than those in cluster 2.

These results were obtained with the use of HPV genotyping for 27 HPV types without applying any weighting or hierarchical attribution by genotype (e.g., giving more weight to carcinogenic types). The four clusters separate out distinct groups with different characteristics, partly reflecting different stages of the natural history of HPV-related disease: younger women are likely to have multiple infections that are mainly productive rather than transforming, and associated with high viral load (cluster 4). Older women have fewer infections and lower viral load if there is no prevalent disease (cluster 3).

Table 2. Clustering and characteristics of HPV genotypes

HPV genotype  Prevalence range (%)  Cluster  Phylogenetic clade/no.  WHO carcinogen classification
16            7.8-71.5              1        α9                      1
31            4.0-30.0              2a       α9                      1
52            3.2-19.7              2a       α9                      1
18            17.0                  2b       α7                      1
39            18.9                  2b       α7                      1
51            19.9                  2b       α5                      1
35            15.4                  2c       α9                      1
58            13.5                  2c       α9                      1
53            13.7                  2c       α6                      2B
56            17.5                  2c       α6                      1
66            15.0                  2c       α6                      2B
59            12.5                  2c       α7                      1
26            5.0                   3a       α5                      2B
11            5.0                   3a       α10                     2B
57            0.0-0.0               3a       α4                      2B
40            8.2                   3a       α8                      2B
45            1.4-11.8              3b       α7                      1
54            10.3                  3b       α13                     2B
55            10.0                  3b       α10                     2B
06            9.5                   3b       α10                     2B
82            7.7                   3b       α5                      2B
68            7.4                   3b       α7                      2A
83            6.4                   3b       α3                      2B
33            10.3                  3c       α9                      1
42            10.3                  3c       α1                      2B
84            6.9                   3c       α3                      2B
73            11.1                  3c       α11                     2B

NOTE: The prevalence range is given for all disease combinations shown in Supplementary Table S1. Cluster indicates the HPV genotype cluster identified in Fig. 2. Phylogenetic clade is based on (19); WHO carcinogen classification is based on (17).
Women with prevalent precancer have fewer infections with many carcinogenic types (most importantly HPV16) and may have high viral load (clusters 1 and 2). Admittedly, the age range in the ASCUS-LSIL triage study population is limited, and the difference in median age between clusters 3 and 4 is very small. It is important to note that the risk prediction for women with CIN2 detected at baseline is limited by censoring because most women with CIN2 had a loop electrosurgical excision procedure, interrupting the natural history. The counterintuitive grouping of CIN2-ASCUS and CIN2-LSIL in disease cluster 4, whereas CIN2-normal is grouped in disease cluster 2, is driven by a higher prevalence of genotypes from HPV cluster 3, which mainly includes noncarcinogenic types. Restricting the clustering to carcinogenic types led to a closer grouping of CIN2 disease groups.

The second dimension of cluster analysis identified three major clusters of HPV genotypes: HPV16 clustered separately from all other HPV genotypes and showed the closest association to risk of CIN3 among all types. The second HPV cluster included nine high-risk types, plus HPV53 and HPV66, and was found at higher frequencies in disease clusters 1, 2, and 4, whereas the third HPV cluster included mainly noncarcinogenic types, plus HPV33 and HPV45, and showed frequencies distributed more evenly across the disease clusters. Although HPV33 and HPV45 had the highest prevalence in CIN3/HSIL in cluster 3, their more uniform distribution in comparison with other carcinogenic types most likely caused their grouping with noncarcinogenic types.

The carcinogenicity of HPV53, HPV66, and HPV68 has been widely debated. In the most recent IARC classification, HPV53 and HPV66 were considered possibly carcinogenic (group 2B), whereas HPV68 was classified as probably carcinogenic (group 2A) because of experimental and phylogenetic evidence but without strong supporting epidemiologic data (17, 18).
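The relation between the HPV genotype clusters and the WHO/IARC carcinogen groups can be checked by a simple cross-tabulation of Table 2. The sketch below transcribes the cluster and classification columns from Table 2 as they appear in this text; collapsing subclusters such as 2a and 2b into a single major cluster is an illustrative choice, and the variable names are hypothetical.

```python
from collections import Counter

# (genotype, HPV genotype cluster, WHO carcinogen group), transcribed from Table 2.
table2 = [
    ("16", "1", "1"), ("31", "2a", "1"), ("52", "2a", "1"),
    ("18", "2b", "1"), ("39", "2b", "1"), ("51", "2b", "1"),
    ("35", "2c", "1"), ("58", "2c", "1"), ("53", "2c", "2B"),
    ("56", "2c", "1"), ("66", "2c", "2B"), ("59", "2c", "1"),
    ("26", "3a", "2B"), ("11", "3a", "2B"), ("57", "3a", "2B"),
    ("40", "3a", "2B"), ("45", "3b", "1"), ("54", "3b", "2B"),
    ("55", "3b", "2B"), ("06", "3b", "2B"), ("82", "3b", "2B"),
    ("68", "3b", "2A"), ("83", "3b", "2B"), ("33", "3c", "1"),
    ("42", "3c", "2B"), ("84", "3c", "2B"), ("73", "3c", "2B"),
]

# Collapse subclusters (2a, 2b, ...) to the major cluster digit, then count
# how many genotypes fall into each (major cluster, WHO group) cell.
crosstab = Counter((cluster[0], who) for _, cluster, who in table2)

for (major, who), n in sorted(crosstab.items()):
    print(f"HPV cluster {major}, WHO group {who}: {n} type(s)")
```

The counts match the description in the text: nine group 1 types plus HPV53 and HPV66 in the second cluster, and mostly group 2B types plus HPV33 and HPV45 in the third.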
The clustering of HPV53 and HPV66 with carcinogenic types in our analysis reflects their ability to cause a spectrum of mild and more severe precursor lesions, including CIN3, but little or no chance of invasion. The correlation of genotype clustering with the phylogenetic clades (19) and the WHO carcinogen classification (17) was quite good (Table 2), and we think that the remaining discrepancies are mainly related to the lack of cancers in the disease spectrum we analyzed. Still, whereas our approach can indicate which genotypes are important in the progression to precancer, it cannot identify the causal type in a multiple infection. Ultimately, lesion-specific genotyping is required to precisely attribute HPV genotypes to cervical precancer.

With our analytic approach, we were able to show the complex relation of HPV genotypes and cervical disease in 2,780 women from the ASCUS-LSIL triage study in a single figure. As exemplified in our study, it is possible to use this technique to address disease misclassification. For example, the same approach can be used to address the heterogeneity of CIN2, a very diverse category that, depending on the local histologic interpretation, may include a lot of low-grade diseases (more likely to be in cluster 4) or that is more similar to CIN3 (more likely to be in cluster 2). Our clustering approach allows studying type attribution to disease in different regions of the world by comparing type allocation with genotype clusters. Similarly, it can be used to analyze shifts in type attribution to disease in vaccinated populations.

Figure 3. Disease group dendrogram with 2-year risk of CIN3 and clinical characteristics. The dendrogram from the heatmap in Fig. 2 is shown with the corresponding disease combinations based on histology (no biopsy, normal, CIN1, CIN2, CIN3) and cytology (normal, ASCUS, LSIL, HSIL). Columns: 2-year risk of CIN3, 2-year risk of CIN2+, 2-year risk of confirmed CIN3 (CIN3 with a consensus clinical center and quality control diagnosis), percent of women >25 years at enrollment, median age, percent of women smoking at enrollment, percent of women with high-grade colposcopy at enrollment, percent of women with a clinical center HSIL cytology result, percent of women with a positive cervigram, median relative light units (RLU)/positive control value in Hybrid Capture 2 (HC2), and percent of women with a positive HC2 result. All percentages and median values, respectively, are given for the four disease clusters.

Due to the frequent misclassification of cervical disease, the disease groups used in the analysis are a combination of true results, cases in which the worst lesion was missed in colposcopy, and cases in which histology and/or cytology was undercalled or overcalled. Despite the common notion that cervical histology is the gold standard of disease ascertainment, in our analysis, cytology was an important indicator of subsequent risk of CIN3+ within the group of normal/low-grade histology, reflected by similar HPV genotype patterns as found in histology-confirmed high-grade disease.
In summary, we present a novel solution to the handling of complex HPV genotype data in cervical disease and apply it in the prospective ASCUS-LSIL triage study. We show that HPV genotype patterns at various cervical disease stages are complex but distinctive when analyzed in aggregate. Our approach makes it easy to display HPV prevalence in disease groups and may be used to compare the distribution of HPV genotypes within diagnostic categories between different populations.

Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

Acknowledgments

We thank Digene, Cytyc, National Testing Laboratories, DenVu, TriPath Imaging, and Roche Molecular Systems for donating or providing at reduced cost some of the equipment and supplies used in the ASCUS-LSIL triage study trial.

Grant Support

Intramural Research Program of the NIH and the National Cancer Institute. Roche Molecular Systems provided reagents and research support to the laboratory of P.E. Gravitt. C.M. Wheeler received support through her institution from Roche Molecular Systems for HPV genotyping studies.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received 04/09/2010; revised 07/17/2010; accepted 08/06/2010; published OnlineFirst 10/19/2010.

References

1. Schiffman M, Castle PE, Jeronimo J, Rodriguez AC, Wacholder S. Human papillomavirus and cervical cancer. Lancet 2007;370:890-907.
2. Smith JS, Lindsay L, Hoots B, et al. Human papillomavirus type distribution in invasive cervical cancer and high-grade cervical lesions: a meta-analysis update. Int J Cancer 2007;121:621-32.
3. Carreon JD, Sherman ME, Guillen D, et al. CIN2 is a much less reproducible and less valid diagnosis than CIN3: results from a histological review of population-based cervical samples. Int J Gynecol Pathol 2007;26:441-6.
4. Wright TC, Jr., Schiffman M. Adding a test for human papillomavirus DNA to cervical-cancer screening. N Engl J Med 2003;348:489-90.
5. Castle PE, Jeronimo J, Schiffman M, et al. Age-related changes of the cervix influence human papillomavirus type distribution. Cancer Res 2006;66:1218-24.
6. Kjaer SK, Breugelmans G, Munk C, Junge J, Watson M, Iftner T. Population-based prevalence, type- and age-specific distribution of HPV in women before introduction of an HPV-vaccination program in Denmark. Int J Cancer 2008;123:1864-70.
7. Wentzensen N, Schiffman M, Dunn T, et al. Multiple human papillomavirus genotype infections in cervical cancer progression in the study to understand cervical cancer early endpoints and determinants. Int J Cancer 2009;125:2151-8.
8. Gage JC, Hanson VW, Abbey K, et al. Number of cervical biopsies and sensitivity of colposcopy. Obstet Gynecol 2006;108:264-72.
9. Pretorius RG, Zhang WH, Belinson JL, et al. Colposcopically directed biopsy, random cervical biopsy, and endocervical curettage in the diagnosis of cervical intraepithelial neoplasia II or worse. Am J Obstet Gynecol 2004;191:430-4.
10. Stoler MH, Schiffman M. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL triage study. JAMA 2001;285:1500-5.
11. Wentzensen N, Schiffman M, Dunn ST, et al. Grading the severity of cervical neoplasia based on combined histopathology, cytopathology, and HPV genotype distribution among 1,700 women referred to colposcopy in Oklahoma. Int J Cancer 2009;124:964-9.
12. Schiffman M, Adrianza ME. ASCUS-LSIL Triage Study. Design, methods and characteristics of trial participants. Acta Cytol 2000;44:726-42.
13. Gravitt PE, Peyton CL, Apple RJ, Wheeler CM. Genotyping of 27 human papillomavirus types by using L1 consensus PCR products by a single-hybridization, reverse line blot detection method. J Clin Microbiol 1998;36:3020-7.
14. Castle PE, Gravitt PE, Solomon D, Wheeler CM, Schiffman M. Comparison of linear array and line blot assay for detection of human papillomavirus and diagnosis of cervical precancer and cancer in the atypical squamous cell of undetermined significance and low-grade squamous intraepithelial lesion triage study. J Clin Microbiol 2008;46:109-17.
15. de Hoon MJ, Imoto S, Nolan J, Miyano S. Open source clustering software. Bioinformatics 2004;20:1453-4.
16. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 1998;95:14863-8.
17. Bouvard V, Baan R, Straif K, et al. A review of human carcinogens - Part B: biological agents. Lancet Oncol 2009;10:321-2.
18. Schiffman M, Clifford G, Buonaguro FM. Classification of weakly carcinogenic human papillomavirus types: addressing the limits of epidemiology at the borderline. Infect Agent Cancer 2009;4:8.
19. Schiffman M, Herrero R, Desalle R, et al. The carcinogenicity of human papillomavirus types reflects viral evolution. Virology 2005;337:76-84.

Hierarchical Clustering of Human Papilloma Virus Genotype Patterns in the ASCUS-LSIL Triage Study. Nicolas Wentzensen, Lauren E. Wilson, Cosette M. Wheeler, et al. Cancer Res. Published OnlineFirst October 19, 2010. doi:10.1158/0008-5472.can-10-1188
Supplementary material: http://cancerres.aacrjournals.org/content/suppl/2010/10/08/0008-5472.can-10-1188.dc1