HEATHER ANN PRENTICE ELIZABETH E. BROWN, COMMITTEE CHAIR RICHARD A. KASLOW JIANMING TANG NICHOLAS M. PAJEWSKI KUI ZHANG A DISSERTATION


HIGH DENSITY GENOTYPING FOR IMMUNOGENETIC POLYMORPHISMS ASSOCIATED WITH TRANSMISSION AND CONTROL OF HIV-1 INFECTION IN ZAMBIAN HETEROSEXUAL SERODISCORDANT COUPLES

by

HEATHER ANN PRENTICE

ELIZABETH E. BROWN, COMMITTEE CHAIR
RICHARD A. KASLOW
JIANMING TANG
NICHOLAS M. PAJEWSKI
KUI ZHANG

A DISSERTATION

Submitted to the graduate faculty of The University of Alabama at Birmingham, in partial fulfillment of the requirements for the degree of Doctor of Philosophy

BIRMINGHAM, ALABAMA

2013

Copyright by Heather Ann Prentice 2013

HIGH DENSITY GENOTYPING FOR IMMUNOGENETIC POLYMORPHISMS ASSOCIATED WITH TRANSMISSION AND CONTROL OF HIV-1 INFECTION IN ZAMBIAN HETEROSEXUAL SERODISCORDANT COUPLES

HEATHER ANN PRENTICE

DOCTOR OF PHILOSOPHY IN EPIDEMIOLOGY

ABSTRACT

Understanding host genetic correlates of HIV-1 outcomes in sub-Saharan African populations is of continued importance due to the disproportionate disease burden in this population. We sought to identify genetic variants associated with HIV-1 acquisition in a sample of 439 initially seronegative individuals enrolled in the Zambia-Emory HIV Research Project, as well as novel variants associated with viral load (VL) in a sample of 172 seroconverters (SCs) with an estimated date of infection and 449 seroprevalent individuals (SPs). We were unable to detect any statistically significant associations in our analysis of HIV-1 acquisition, though one signal in the HLA-DOA gene associated with accelerated time-to-transmission (rs592625) approached significance at a p value threshold of 2.8×10⁻⁵ (HR=1.6, p=4.0×10⁻⁴). We observed an association between a variant in the NOTCH4 gene (rs ) and increased set-point VL in SCs that was consistent with two prior studies. Furthermore, rs in the HLA-DOB gene was associated with increased VL in SPs. We also identified several novel variants and haplotypes within the IL10 gene cluster on chromosome 1 associated with both set-point VL and chronic VL. None of these novel associations in the IL10 gene cluster were statistically significant after accounting for multiple testing. In our analysis of SCs, we were able to confirm a number of associations previously reported in other populations, including three variants intergenic between IL20 and IL24: rs (β=-0.19, p=0.010), rs (β=-0.17, p=0.029), and rs (β=-0.17, p=0.025). This dissertation demonstrates the value of a targeted genotyping platform designed for fine-mapping and replication in the discovery of novel genetic signals for three different HIV-1 phenotypes. If confirmed, the associations reported here within the NOTCH4 and HLA-DO genes may warrant further investigation in the search for genetically informed therapies that could help to prevent HIV-1 infection or to treat the disease after infection.

Keywords: epidemiology, HIV-1, acquisition, viral load, major histocompatibility complex (MHC), interleukin-10 (IL10)

DEDICATION

This dissertation is dedicated to my parents Kevin and Cheryl, and siblings Mathew and Kristina, for their support in all my educational endeavors. I would also like to dedicate this dissertation to my dear friends who have been like a family to me while living so far from my parents and siblings to accomplish my educational pursuits.

ACKNOWLEDGEMENTS

I would like to thank my entire committee for their support throughout my dissertation work. I am grateful to my mentors, Drs. Richard Kaslow, Jianming Tang, and Elizabeth Brown, for their continued guidance, without which this dissertation would not have been possible. I am also grateful to Drs. Nicholas Pajewski and Kui Zhang for their statistical expertise and generous contributions to this research. I would also like to thank Travis Porter and Drs. Wei Song and Aimee Merino, who performed most of the genotyping necessary for this project. This work was supported by multiple grants, including R01 AI , R37 AI , R01 AI from the NIAID, and funding from the International AIDS Vaccine Initiative (IAVI). This work was further supported by the IAVI Protocol C research network. I am indebted to all investigators, staff, and participants in the Zambia-Emory HIV Research Project who made this study possible. In particular, I thank Ilene Brill for data management.

TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
INTRODUCTION
    Background
    Host Genetic Factors and HIV-1 Acquisition
    Host Genetic Factors and Viral Load
STUDY OBJECTIVE AND STUDY AIMS
MATERIALS AND METHODS
    Subjects
    Study Data
    ImmunoChip
    Genotype Calling and Quality Control
    Population Stratification
    Linkage Disequilibrium
    Human Subjects
    Statistical Methods
HIV-1 DYNAMICS: A REAPPRAISAL OF HOST AND VIRAL FACTORS, AS WELL AS METHODOLOGICAL ISSUES
GENETIC CORRELATES WITHIN THE HUMAN EXTENDED MAJOR HISTOCOMPATIBILITY COMPLEX AND HIV-1 ACQUISITION
APPLICATION OF FINE-MAPPING WITHIN THE EXTENDED HUMAN MAJOR HISTOCOMPATIBILITY COMPLEX IN IDENTIFICATION OF NOVEL VARIANTS WITH VIRAL CONTROL DURING TWO DISTINCT STAGES OF HIV-1 INFECTION
POLYMORPHISMS IN THE IL10 GENE CLUSTER AND HIV-1 VIRAL LOAD IN EARLY AND CHRONIC INFECTION
CONCLUSIONS
GENERAL LIST OF REFERENCES
APPENDIX
    A COMPARISON OF SUBJECTS INCLUDED FOR EACH ANALYSIS AND REMAINING ELIGIBLE SUBJECTS WITHIN ZEHRP NOT INCLUDED FOR ANALYSIS
        Table 1. Demographic, genetic, and virologic characteristics of seroconverting individuals (SCs) and exposed seronegative individuals (ESNs) analyzed compared to eligible individuals from ZEHRP not included for analysis
        Table 2. Demographic, genetic, and virologic characteristics of seroconverting individuals (SCs) analyzed compared to eligible SCs from ZEHRP not included for analysis
        Table 3. Demographic, genetic, and virologic characteristics of seroprevalent individuals (SPs) analyzed compared to eligible SPs from ZEHRP not included for analysis
    B INSTITUTIONAL REVIEW BOARD APPROVAL FORM

LIST OF TABLES

INTRODUCTION
    1 Overview of prior genome-wide association studies on HIV-1 outcomes
    2 Comparison of number of polymorphisms for selected genes on the Illumina 1M-Duo and ImmunoChip arrays

HIV-1 DYNAMICS: A REAPPRAISAL OF HOST AND VIRAL FACTORS, AS WELL AS METHODOLOGICAL ISSUES
    1 Host genetic factors that are positively or negatively associated with HIV-1 viral load (VL) set-point or assumed set-point, as reported in recent studies
    2 Viral markers that are associated with HIV-1 set-point viral load (VL), as reported in recent studies

GENETIC CORRELATES WITHIN THE HUMAN EXTENDED MAJOR HISTOCOMPATIBILITY COMPLEX AND HIV-1 ACQUISITION
    1 Overall characteristics of 212 seroconverters (SCs) and 227 exposed seronegatives (ESNs) with SNP genotyping results
    2 Summary of additive Cox proportional hazard models for HIV-1 acquisition
    Supplemental Tables
    1 Univariable results of covariates included in multivariable models for transmission analysis of HIV-1 acquisition
    2 Comparison of overall characteristics for seroconverters (SCs) and exposed seronegatives (ESNs) included for multivariable analysis versus individuals excluded

APPLICATION OF FINE-MAPPING WITHIN THE EXTENDED HUMAN MAJOR HISTOCOMPATIBILITY COMPLEX IN IDENTIFICATION OF NOVEL VARIANTS WITH VIRAL CONTROL DURING TWO DISTINCT STAGES OF HIV-1 INFECTION
    1 Overall characteristics of 172 seroconverters (SCs) and 449 seroprevalent subjects (SPs) with SNP genotyping results
    2 Summary of regression analyses for geometric mean set-point viral load in HIV-1 seroconverters
    3 Summary of HyperLasso results for Box-Cox transformed log10 geometric mean viral load in seroconverters
    4 Summary of regression analyses for viral load in HIV-1 seroprevalent individuals
    5 Summary of HyperLasso results for Box-Cox transformed log10 viral load (VL) in seroprevalent individuals
    Supplemental Tables
    1 Summary of null permutations to calibrate scale parameter of HyperLasso model
    2 Univariable results of covariates in seroconverter (SCs) and seroprevalent (SPs) multivariable viral load (VL) analyses

POLYMORPHISMS IN THE IL10 GENE CLUSTER AND HIV-1 VIRAL LOAD IN EARLY AND CHRONIC INFECTION
    1 Overall characteristics of 172 seroconverters (SCs) and 449 seroprevalent subjects (SPs) with SNP genotyping results
    2 Summary of models for log10 transformed geometric mean set-point viral load in seroconverters
    3 Summary of models for log10 transformed viral load in seroprevalent partners
    4 Haplotype analysis of log10 transformed geometric mean set-point viral load in seroconverters
    5 Haplotype analysis of log10 transformed viral load in seroprevalent individuals

LIST OF FIGURES

INTRODUCTION
    1 Multidimensional scaling analysis of a) ZEHRP cohort and samples from HapMap3 and b) a close examination of clustering for samples with African ancestry
    2 Distribution of a) seroconverter (SC) and b) seroprevalent (SP) viral loads before and after transformation

HIV-1 DYNAMICS: A REAPPRAISAL OF HOST AND VIRAL FACTORS, AS WELL AS METHODOLOGICAL ISSUES
    1 Selection of recent (post-2010) publications for systematic review

GENETIC CORRELATES WITHIN THE HUMAN EXTENDED MAJOR HISTOCOMPATIBILITY COMPLEX AND HIV-1 ACQUISITION
    1 Minus log p value (MLP) plot for analysis of HIV-1 acquisition within the major histocompatibility complex
    2 Kaplan-Meier curve of time-to-transmission (in weeks) according to rs genotypes (a variant in the UTR region of the HLA-DOA gene)

APPLICATION OF FINE-MAPPING WITHIN THE EXTENDED HUMAN MAJOR HISTOCOMPATIBILITY COMPLEX IN IDENTIFICATION OF NOVEL VARIANTS WITH VIRAL CONTROL DURING TWO DISTINCT STAGES OF HIV-1 INFECTION
    1 Associations of single nucleotide polymorphisms within the extended major histocompatibility complex with Box-Cox transformed log10 viral load in seroconverters
    2 Associations of single nucleotide polymorphisms within the extended major histocompatibility complex with Box-Cox transformed log10 viral load in seroprevalent individuals

POLYMORPHISMS IN THE IL10 GENE CLUSTER AND HIV-1 VIRAL LOAD IN EARLY AND CHRONIC INFECTION
    Supplemental Figures
    1 Haplotype blocks detected for seroconverters in a) IL10, IL19, IL20, and IL24 and b) IL10RA
    2 Haplotype blocks detected for seroprevalent individuals in a) IL10, IL19, IL20, and IL24 and b) IL10RA

LIST OF ABBREVIATIONS

AIDS    acquired immunodeficiency syndrome
AIMS    ancestry informative markers
ART     antiretroviral therapy
CCR     chemokine receptor
CTL     cytotoxic T-lymphocyte
CVCT    couples' voluntary counseling and testing
DOF     duration of follow up
EDI     estimated date of infection
ESN     exposed yet still seronegative individual
FDR     false discovery rate
GWAS    genome-wide association study
HAART   highly active antiretroviral therapy
HIV-1   human immunodeficiency virus type 1
HLA     human leukocyte antigen
IL      interleukin
KIR     killer immunoglobulin-like receptor
LD      linkage disequilibrium
MHC     major histocompatibility complex
MDS     multidimensional scaling
NK      natural killer cell
OI      opportunistic infection
SC      seroconverting individual
SP      seroprevalent individual
SNP     single nucleotide polymorphism
VL      HIV-1 RNA viral load
ZEHRP   Zambia-Emory HIV Research Project

INTRODUCTION

Background. Control of the global HIV/AIDS epidemic has gained considerable momentum in recent years, thanks to the increased proliferation of preventive strategies and access to antiretroviral therapy (ART). The incidence of new human immunodeficiency virus type 1 (HIV-1) infections and the number of deaths related to acquired immunodeficiency syndrome (AIDS) have been declining since the late 1990s, though the disease is by no means eradicated. At the end of 2011, an estimated 34 million people worldwide were living with HIV-1. Currently, about 2.5 million new HIV-1 infections and 1.7 million AIDS-related deaths are expected each year 1. In particular, sub-Saharan Africa bears a disproportionate burden of the epidemic 1, with almost 23.5 million adults and children living with HIV/AIDS and accounting for almost 70 percent of new HIV-1 infections and AIDS-related deaths. In sub-Saharan Africa, HIV-1 is primarily transmitted by unprotected sex within heterosexual relationships, although transmission does also occur horizontally from men to men and vertically from mother to child 1.

In the first few weeks after HIV-1 infection, rapid replication of the virus leads to a dramatic increase of HIV-1 RNA in the plasma of the newly infected partner, which is then followed by a steep decline until a viral set-point is reached 2-6. RNA viral load (VL) levels then remain relatively constant and can stay this way for years in clinical latency. Progression to AIDS is characterized by increasing levels of RNA VL and CD4+ T cell loss through either cell destruction or decreased production, increasing the risk for opportunistic diseases to develop due to diminished immune capacity. An AIDS diagnosis in an HIV-1-positive individual is made when CD4+ T cell counts decline below 200 cells/mm³ of blood, the CD4+ T-lymphocyte percentage of total lymphocytes is less than 14 percent, or at least one opportunistic infection is observed 7-9.

Both HIV-1 susceptibility and immune control have striking inter-individual variability, due to a number of host, environmental, and viral factors 7. Much attention has been paid to host genetic and immunologic factors with the hope of identifying clinical biomarkers that can be used in screening tests to identify those predisposed to infection and accelerated disease progression, as well as vaccines that take advantage of host defenses 23,24. In contrast to HIV-1, which readily mutates to attain resistance, host genes are stable; thus the identification of host genetic factors that affect susceptibility and immune control is crucial for the development of an effective vaccine.

Historically, candidate gene studies have been utilized, but these studies require prior rationale for studying the gene of interest and have typically led to inconsistent results in the literature. Genome-wide association studies (GWAS) have more recently come into focus because they allow variation across the entire human genome to be studied efficiently with a hypothesis-free approach. It has been hypothesized that roughly 20 percent of the phenotypic variability in HIV-1 viral load and disease control has been explained through candidate gene studies and GWAS, with a much lower level of variability explained for HIV-1 susceptibility. While the remaining unexplained variation likely entails components due to environmental or viral factors, it is likely that host genetic factors contributing to inter-individual variability remain to be identified 23,24,27.

There may be other facets of genetic variation yet to be defined. Genotyping arrays for standard GWAS are designed to include single nucleotide polymorphisms (SNPs) that tag regions across the entire genome. The tagging SNP is typically unlikely to be the actual causative variant; rather, the hope is that any association found for a tagging SNP arises because that SNP is in linkage disequilibrium (LD) with the actual causal variant 28. In contrast to GWAS, fine-mapping provides denser coverage of certain regions of the genome and may provide a new avenue for identifying previously missed genetic variants associated with HIV-1 acquisition and immune control.

Of note, the majority of research into complex trait genetics (not just for HIV-1) has been designed for and focused on populations of European ancestry rather than sub-Saharan African populations, where HIV-1 risk and disease burden are greatest 29,30. Genetic diversity differs across populations, with greater diversity in older populations, implying that associations found in one population may not be applicable to other populations 25,30,31. There is greater genetic diversity and shorter spans of LD between genetic markers among African populations; therefore, tagging SNPs in traditional GWAS arrays designed for populations of European descent may not be optimal for studies in African populations. Increased coverage through fine-mapping may help to more readily find associations that have been previously missed due to varying patterns of genetic diversity between ethnic populations. As a result of the shorter spans of LD, one advantage of fine-mapping for African populations is that actual causal variants may be more readily identified 25,29. Focused research on genetic variability within sub-Saharan African populations is imperative: there is potential to discover novel loci associated with HIV-1 susceptibility and immune control, which can then be utilized for targeted clinical screening and vaccine development purposes.

Genetic determinants for HIV-1 acquisition and immune control can be classified into two broad categories: genes involved in the viral life cycle and genes that govern the host immune response 32,33. Research on cell entry has been directed primarily to genes encoding chemokine receptors (CCR and CXCR) and their ligands (MIP1α, MIP1β, RANTES, and SDF-1), as well as cytokines (i.e., IL-2, IL-4, IL-10, and IFN-γ). Chemokine receptors act as co-receptors to CD4+ molecules and are needed for the HIV-1 virus to enter an immune cell. The majority of HIV-1 isolates are non-syncytium-inducing strains, requiring C-C receptors, encoded by the CCR5 and CCR2 genes, for cell entry. Syncytium-inducing strains tend to arise at later stages of infection and gain entry through C-X-C receptors. Chemokine ligands compete with HIV-1 for binding on receptors, thereby blocking viral entry. The CCR5 ligands MIP1α, MIP1β, and RANTES are encoded by CCL3, CCL4, and CCL5, respectively. SDF-1, which is encoded by CXCL12, is the ligand for CXCR4. Depending on the cell type, cytokines work to either upregulate or downregulate chemokine and chemokine receptor expression 38. Interleukin (IL)-2, encoded by the IL2 gene, works to increase expression of CCR5 on T cells, while IL-4, encoded by the IL4 gene, acts to decrease expression. Interestingly, IFN-γ and IL-10 can both upregulate and downregulate expression; the genes encoding IFN-γ and IL-10 are IFNG and IL10, respectively.

Genetic research on the host immune response elicited by HIV-1 infection has targeted the major histocompatibility complex (MHC) region of the genome, with particular focus on genes encoding human leukocyte antigen (HLA) class I and class II molecules. The MHC contains the most polymorphic loci in humans, and gene products from this area function in both innate and acquired immune responses 39. HLA-I and HLA-II molecules are instrumental in the acquired immune response; these molecules bind to antigenic epitopes for presentation to CD8+ T cells and CD4+ T cells, respectively. Within HLA-I, three different types of molecules, HLA-A, -B, and -C, bind to epitopes of intracellular pathogens to present to CD8+ T cells, eliciting a cytotoxic T cell response 34,39. Similarly, three different types of molecules within HLA-II, HLA-DR, -DQ, and -DP, bind to extracellular pathogenic epitopes for presentation to CD4+ T cells to activate an antibody-mediated response 39. The diversity of HLA genes allows different alleles to bind to different pathogenic epitopes because of variation in the peptide-binding groove regions of the HLA molecule. Common alleles also tend to differ between racial/ethnic populations, and associations between alleles and HIV-1 infection and immune control appear to vary depending on the population of interest.

Although located outside of the MHC, killer immunoglobulin-like receptor (KIR) genes have also been instrumental in research on the host immune response because of the intimate relationship of their products with HLA-I molecules. KIR molecules are expressed on natural killer (NK) cells and act to inhibit or activate the innate NK cell cytotoxic action by binding to HLA-I molecules expressed on other cells; KIR bind to either HLA-B or HLA-C molecules depending on the type of immunoglobulin domain on the extracellular portion of the KIR molecule, 3D and 2D respectively 34,50.

Host Genetic Factors and HIV-1 Acquisition. There is substantially less evidence for genetic associations with HIV-1 transmission and acquisition compared to that for immune control. This is primarily due to difficulties in prospectively following individuals until disease transmission; most individuals are only identified after transmission has occurred. In addition, individuals with the highest risk for infection also typically exhibit non-genetic risk factors, creating a potential for confounding within the context of observational studies 34. While longitudinal studies of heterosexual serodiscordant couples are a desirable design for examining this issue, very few studies have been able to take advantage of such a structure because of the need for extended follow-up. Because of these difficulties in study design, the influence of immune-regulated genes, such as HLA and KIR, on acquisition is not clear. Many associations reported in single cohorts have yet to be verified in subsequent studies 34,51. At a genome-wide level, two studies, both in African populations, assessing HIV-1 transmission yielded no markers with statistically significant associations (Table 1) 52,53. One meta-analysis of multiple European cohorts noted an association with rs , a SNP within the CYP7B1 gene, but further investigation is necessary to validate whether this gene has a functional role in HIV-1 acquisition 54.

The most widely recognized genetic factor related to HIV-1 acquisition is the 32-base pair deletion in the CCR5 gene, also known as CCR5-Δ32 (HHG*2), where individuals homozygous for this deletion are almost completely resistant to infection and those who are heterozygous have delayed acquisition. CCR5-Δ32 produces a nonfunctional chemokine receptor, preventing cell entry of the virus; however, this deletion is only observed at a low frequency in populations of European descent. One Zambian study looking at CCR2-CCR5 haplotypes found the HHD/HHE diplotype to be associated with accelerated HIV-1 transmission 64. Of the chemokine ligands, deleterious associations have been found for RANTES through variants within the CCL5 gene (-403A, In1.1C, and 3'222C) 24,65. Increased copy number variation in the CCL3L1 gene encoding the isoform MIP1αP has also been associated with delayed HIV-1 acquisition in certain cohorts, most likely through increasing levels of MIP1αP, which has a higher affinity for CCR5 compared to MIP1α variants, yet the association has not been consistently observed across multiple populations 36.

Host Genetic Factors and Viral Load. "HIV-1 dynamics: a reappraisal of host and viral factors, as well as methodological issues" 69 provides a review of recent population-based research assessing the relationship between various host genes and viral load (VL). Eleven GWAS focusing on HIV-1 outcomes have been conducted, six considering VL as a phenotype. Key findings are summarized in Table 1. Two SNPs have consistently been identified in studies of VL control, though not always with statistical significance after adjustment for covariates: rs located within the HCP5 gene and rs located upstream of HLA-C. The HCP5 SNP, and potentially the HLA-C 5' SNP, appear to be in complete LD with HLA-B*57:01, the best known genetic factor associated with HIV-1 outcomes 70,72,76. A single SNP, rs in HLA-B, appears to be associated with VL control in an African-American cohort and is in complete LD with B*57:03, the equivalent of B*57:01 in African populations 77. One study that used fine-mapping of HLA genotypes in an African American cohort reiterated the importance of HLA-B, specifically for amino acid positions 97, 116, and

The HIV-1 virus has a well-established relationship with components of the human immune system, both for infection and viral replication; therefore, using the ImmunoChip, with its dense coverage of immunogenetic variants, may be beneficial in identifying some of the unexplained variation in HIV-1 outcomes. The aim of this study was to take advantage of the unique coverage provided by the ImmunoChip, a customized genotyping platform designed to permit fine-mapping and deep replication of established GWAS variants relevant to autoimmune and inflammatory diseases, and to utilize it in a population poorly represented in the literature with the hope of identifying novel loci for further host genetic research.
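The tagging argument made earlier in this Introduction can be made concrete with a small numerical sketch. The Python fragment below is an illustration only and is not part of the analyses reported in this dissertation: it computes the LD coefficient D and the squared correlation r² between two biallelic SNPs from phased haplotype counts. The function name `ld_r2` and the example counts are hypothetical.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) for two biallelic SNPs given phased haplotype counts.

    n_AB counts haplotypes carrying allele A at SNP 1 and allele B at SNP 2,
    and likewise for the other three haplotypes.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total               # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total       # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total       # frequency of allele B at SNP 2
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r2 is undefined when either SNP is monomorphic")
    d = p_AB - p_A * p_B              # deviation from independence
    return d, d * d / denom


# Hypothetical counts: a tag SNP that always travels with the causal allele,
# and one that is only partially correlated with it.
perfect = ld_r2(50, 0, 0, 50)
partial = ld_r2(40, 10, 10, 40)
print(perfect, partial)
```

When r² is close to 1, the tag SNP is a good proxy for the causal variant. With the shorter spans of LD described above for African populations, r² between an array SNP and a nearby causal variant tends to be lower, which is the motivation for denser coverage.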

STUDY OBJECTIVE AND SPECIFIC AIMS

Study objective. Following earlier evidence from the Zambia-Emory HIV Research Project cohort of HIV-1 serodiscordant couples, this study uses fine-mapping to evaluate genes contained in the human MHC and their variants as independent contributors to HIV-1 acquisition and immune control.

Aim 1. Identify, confirm, and refine genetic markers within the human MHC as factors associated with HIV-1 acquisition.

Aim 2. Identify, confirm, and refine genetic markers within the human MHC as factors associated with immune control of HIV-1.

Aim 3. Identify, confirm, and refine genetic markers in the IL10 gene cluster as factors associated with immune control of HIV-1.

MATERIALS AND METHODS

Subjects. We analyzed ImmunoChip data for a selection of ART-naïve individuals enrolled in the Zambia-Emory HIV Research Project (ZEHRP), a prospective study started in 1995 offering couples' voluntary counseling and testing (CVCT) 79 to heterosexual couples in Lusaka, Zambia. The design and structure of this cohort have been described in detail previously 21,80,81. Briefly, the ZEHRP cohort includes couples recruited by community outreach workers through three sites in Lusaka. Couples were provided with pretest counseling followed by HIV-1 and syphilis testing, posttest counseling, and syphilis treatment when needed. Heterosexual HIV-1 serodiscordant couples identified through CVCT who had been cohabitating in Lusaka for at least nine months were included in the cohort study. Upon enrollment, couples were followed during quarterly visits where information was collected on sexual exposures with STI screening, interim medical history and physical examination, and HIV-1 serology for the seronegative partner. Confirmatory blood testing was completed for seronegative partners with positive HIV-1 serology. In couples where transmission occurred, epidemiological linkage of the virus was examined by sequence analysis and phylogenetic tree analysis for both partners. Couples with sequence distances below the distance range of the reference sequences were classified as linked, and couples with sequence distances within the distance range of the reference sequences were classified as unlinked.
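The linkage rule described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure used in ZEHRP: it uses an uncorrected pairwise distance (proportion of differing sites) on pre-aligned sequences, whereas the study relied on sequence analysis and phylogenetic trees. The function names, the toy sequences, and the reference distances are all hypothetical, and any pair not falling below the reference range is treated as unlinked.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in compared) / len(compared)


def classify_pair(pair_distance, reference_distances):
    """Label a couple 'linked' if its distance is below every reference distance."""
    if pair_distance < min(reference_distances):
        return "linked"
    # Assumption: anything inside (or above) the reference range is unlinked.
    return "unlinked"


# Hypothetical distances among unrelated local reference sequences.
reference = [0.05, 0.08, 0.12]
donor, recipient = "ACGTACGTAC", "ACGTACGTAC"
print(classify_pair(p_distance(donor, recipient), reference))
```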

Of the couples enrolled in the cohort, 253 exposed seronegative individuals (ESNs) and 242 seroconverting individuals (SCs) with clinical information and biological samples were selected for analyses of HIV-1 acquisition. For analyses of VL control, 242 SCs and 482 seroprevalent (SP) individuals with VL measurements were selected. Individuals were predominantly infected with HIV-1 clade C. Appendix A provides comparisons between subjects included for each analysis after application of exclusion criteria (described in the Genotype Calling and Quality Control and Population Stratification sections) and the remaining subjects from the ZEHRP cohort not included for analysis.

Study Data. Data for analysis included patient demographics, ImmunoChip and HLA genotyping, and virologic and clinical parameters. Major outcomes in this study included HIV-1 acquisition and immune control as described herein. For the acquisition analysis, initially serodiscordant couples were enrolled and observed longitudinally over the 14-year study period and classified into two categories ("transmission pairs" and "nontransmission pairs"). For immune control, HIV-1 RNA VL was obtained for individuals according to two classifications differing with respect to the stage of infection ("set-point VL" for recent SCs and "chronic VL" for SPs). VL is an established surrogate marker for immune control that can be measured during the early stages of infection for clinical application and projection of disease course 60. HIV-1 RNA measurements were made using the Roche Amplicor 1.0 assay (Roche Diagnostics Systems Inc., Branchburg, NJ), with a lower limit of detection of 400 copies/ml of plasma, in a laboratory certified by the virology quality assurance program of the AIDS Clinical Trials Group.
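As an illustration of how repeated VL measurements might be summarized on the log10 scale, the short Python sketch below computes a log10 geometric mean VL for one individual. It is a minimal example under stated assumptions rather than the procedure used in this study: the function name is hypothetical, and replacing measurements below the assay limit with the limit itself (400 copies/ml) is an assumption made here for illustration only.

```python
import math

DETECTION_LIMIT = 400.0  # copies/ml, lower limit of the assay described above


def log10_geometric_mean_vl(measurements, limit=DETECTION_LIMIT):
    """Return the mean of log10 VL values, i.e. log10 of the geometric mean.

    Assumption for illustration: values below the detection limit are set
    to the limit before taking logarithms.
    """
    if not measurements:
        raise ValueError("at least one VL measurement is required")
    logs = [math.log10(max(value, limit)) for value in measurements]
    return sum(logs) / len(logs)


# Hypothetical measurements (copies/ml) from successive visits.
set_point = log10_geometric_mean_vl([10_000, 100_000])
print(set_point, 10 ** set_point)
```

Averaging on the log10 scale keeps a single high measurement from dominating the summary, which is one reason geometric means are commonly used for VL.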

26 12 Additional data available relevant to this study included: age, gender, KIR genotyping, HLA-B position two signal peptides, VL in the index partner (SP VL for the acquisition analysis), number of study visits, duration of follow-up (DOF) and the estimated date of infection (EDI). ImmunoChip. Content for the ImmunoChip was based on prior GWAS of autoimmune and inflammatory diseases and additional preliminary sequence information available from the 1000 Genomes Project 82. The ImmunoChip contains 718 small deletions and 195,806 SNPs for a total of 196,524 polymorphisms 83,84. In contrast to a standard GWAS array that includes markers spanning the whole genome, the ImmunoChip includes markers within the MHC and other genetic regions with a known relationship to immune function 85. Although polymorphisms for HIV-1 outcomes were not directly included on the ImmunoChip as part of its design, many genetic regions of interest based on prior association studies of HIV-1 are still well represented on the ImmunoChip (Table 2). Genotyping and Quality Control. HLA class I and class II genotyping was completed to the first four digits using a combination of PCR-based methods including PCR with sequence-specific primers (SSP) (Dynal/Invirtrogen, Brown Deer, WI), automated sequence-specific oligonucleotide (SSO) probe hybridization (Innogenetics, Alpharetta, GA), and sequencing-based typing (SBT) (Abbott Molecular, Inc., Des Plaines, IL) using capillary electrophoresis and the ABI 3130xl DNA Analyzer (Applied Biosystems, Foster City, CA). Genotype calling was initially performed using the default GENCALL algorithm in Illumina s Beadstudio software. There was some concern about the accuracy of

27 13 inferred genotypes for the ImmunoChip as the array specifically targets a number of rare variants. Accurately inferring rare variants using algorithms such as GENCALL can be problematic since calling algorithms typically rely upon clustering techniques in comparison to a reference training data set 86. Rare variants are typically not well represented in these reference datasets. In contrast, the BEAGLECALL algorithm unifies genotype inference with estimation of haplotypic phase, which can help to rectify spurious genotype calls within the context of LD. This process can help to remove genotype artifacts that may lead to false-positive associations in downstream analyses 87. In general, we found the genotype calls to be highly concordant between GENCALL and BEAGLECALL. Amongst SNPs mapping to the extended MHC (using the boundaries of rs to rs ), the overall concordance rate between the two algorithms was In all subsequent analyses, we use the genotypes inferred by BEAGLECALL. We excluded samples with a high degree of missing data (call rates <0.95) and those where genetically determined sex (based on genotypes from the X and Y chromosomes) did not match with self-reported sex from clinical data. We also excluded SNPs with call rates <0.985 or those exhibiting significant deviation from Hardy- Weinberg equilibrium (p <10-6 ). For the entire ImmunoChip, 181,732 autosomal SNPs passed through this initial quality control filter. Cryptic relatedness was checked using the robust kinship estimation procedure implemented in the KING software package 89. The algorithm implemented in KING is notably robust to the presence of population structure (kinship estimates typically assume a homogenous population), permitting reliable relationship inference up to approximately third degree relatives in multi-ethnic samples. For pairs of individuals estimated to be

third degree relatives or greater, we used the unrelated selection procedure described in Manichaikul et al. and implemented in KING 90.

Population Stratification. Population stratification was assessed using multidimensional scaling (MDS) implemented in KING 90,91. A number of ancestry informative markers (AIMs) informative for African and Native American ancestry have been included on the ImmunoChip for this purpose. We merged the ImmunoChip data with data from the 11 populations included in Phase 3 of the International HapMap Project (N=1,184 unrelated individuals), which were genotyped on the Illumina 1M and Affymetrix 6.0 arrays 92. For the MDS analyses, we utilized approximately 30,000 SNPs outside the extended MHC that were annotated with rsIDs on the ImmunoChip and where strand ambiguities could be reliably resolved when comparing to the HapMap data. As shown in Figure 1, the inferred MDS axes suggest that the content of the ImmunoChip is sufficient for assigning global ancestry. The clustering of the HapMap3 populations based on the top three MDS axes is very similar to that reported by Pemberton et al., which was based on genome-wide genotypes 93. When looking only at the populations with some degree of African ancestry, the MDS analysis partially differentiates the ZEHRP cohort from the other African ancestry populations included in HapMap3 (YRI, LWK, MKK, and ASW) (Figure 1b). The only deficiency appears to be that the MDS analysis is unable to separate the ZEHRP cohort from the YRI population group, which represents Yorubans from Ibadan, Nigeria. This deficiency is not entirely unexpected, as the ImmunoChip was not specifically designed for use in African populations.

Linkage Disequilibrium. SNPs identified during analysis were further examined for LD with previously identified variants (SNPs or HLA alleles) to see if the SNP

identified was likely to be tagging an already known association. Penalized regression models (described in the Statistical Methods section) were used to tease apart independent association signals within the context of the extensive LD present within the human MHC.

Human Subjects. All study procedures and consent forms have been approved through the University of Alabama at Birmingham Institutional Review Board, the University Teaching Hospital Research Ethics Committee in Lusaka, and the Office of Protection from Research Risks of the National Institutes of Health. All original studies were reviewed and approved by the University of Alabama at Birmingham Institutional Review Board (Appendix C).

Statistical Methods. A major aim of this study was to assess the effect of individual variants on HIV-1 acquisition and immune control independently from established genetic variants (particularly HLA alleles). Such inference is naturally complicated by the extensive LD present within the human MHC, which creates a substantial degree of collinearity. This goal was approached using two complementary methodologies. First, single variants were tested in additive multivariable regression models adjusting for alleles previously reported to be associated with the outcome of interest in addition to clinical risk factors. In acquisition models, additional variants adjusted for included HLA-A*68:02 in the initially seronegative partner and A*36 and KIR2DS4 in the index partner 94,95. In VL models, HLA variants adjusted for included HLA-A*74, B*13, B*57, B*81, and DRB1*01: The second approach used penalized regression models, which were implemented using all ImmunoChip variants within the given region of interest, as well

as the HLA variants included in the single variant models above. Several authors have demonstrated the promising performance of penalized regression approaches such as the LASSO or ridge regression to identify important genetic predictors (either SNPs or gene expression measurements) in the presence of LD/collinearity. Intuitively, these approaches operate by building models that simultaneously consider all available predictors. In genetics settings, such models will typically be over-specified, involving more predictors than observations (so-called p>n scenarios). However, by inducing a penalty on the regression coefficients, penalized approaches typically display greater predictive accuracy and can exhibit higher power as they consider multiple genetic markers simultaneously, at the expense of producing downwardly biased coefficient estimates. As an example, Vignal et al. recently used penalized logistic regression based on a Normal-Exponential-Gamma (NEG) prior (see Hoggart et al. 98) specifically to fine-map risk variants for rheumatoid arthritis within the MHC 103.

The NEG prior is parameterized in terms of shape and scale parameters that need to be chosen to properly calibrate the type I error rate of the HyperLasso model. We set the shape parameter to 1 and analyzed 100 null permutations of each study group over a grid of scale parameter values. The null datasets were created by randomly pairing phenotypes/non-genetic covariates with genotypes (thus maintaining the relationship between the non-genetic covariates and their corresponding VLs). We then selected a value for the scale parameter that produced a mean error rate of approximately 1 SNP selected into the regression model. SNPs were modeled assuming an additive effect, and were only included if the minor allele was observed at least 10 times (either heterozygote or homozygote genotypes) in the particular dataset.
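The null-permutation and minor-allele inclusion steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the HyperLasso or R code used in the study: the function names, the toy data layout (a dictionary mapping each SNP to a list of 0/1/2 minor-allele dosages, one per individual), and the reading of "observed at least 10 times" as a count of individuals carrying the minor allele are all hypothetical choices made here.

```python
import random


def minor_allele_carriers(dosages):
    """Count individuals with a heterozygote (1) or homozygote (2) minor-allele genotype."""
    return sum(1 for d in dosages if d >= 1)


def filter_snps(genotypes, min_carriers=10):
    """Keep only SNPs whose minor allele is seen in at least `min_carriers` individuals.

    genotypes: dict of snp_id -> list of dosages (0, 1, or 2), one per individual.
    """
    return {
        snp: dosages
        for snp, dosages in genotypes.items()
        if minor_allele_carriers(dosages) >= min_carriers
    }


def make_null_dataset(genotypes, phenotype_rows, seed):
    """Randomly re-pair whole genotype vectors with phenotype/covariate rows.

    The same shuffled order is applied to every SNP, so each individual's
    genotypes stay together (preserving LD), and phenotype_rows are returned
    untouched, so each VL stays attached to its own non-genetic covariates.
    """
    n = len(phenotype_rows)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    shuffled = {snp: [d[i] for i in order] for snp, d in genotypes.items()}
    return shuffled, phenotype_rows
```

In a calibration loop, one would generate many such null datasets (100 per study group in the text), fit the penalized model at each candidate scale value, and record how many SNPs are selected; the fitting step itself is outside this sketch.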

PLINK 104 and SAS (Cary, NC) were both employed for analysis of individual statistical models and to detect patterns of LD between variants. Analysis for penalized regression was implemented using the HyperLasso package of Hoggart et al. and custom scripts written in the R Statistical Computing Environment. Cox proportional hazards models were used to test the association between single variants and HIV-1 acquisition. General linear models were used to test the association between variants and immune control. The normality assumption for general linear analysis of log10-transformed VL was validated by a Kolmogorov-Smirnov test (Figure 2). Ordinal logistic regression categorizing VL into three levels (<10,000 copies/ml, 10,000-100,000 copies/ml, and >100,000 copies/ml) was also considered for VL analysis. Two methods were used to account for multiple hypothesis testing in the analyses of HIV-1 acquisition and VL control: q-values using a false discovery rate (FDR) of 0.2 and the Bonferroni correction. SimpleM was used to calculate the effective number of independent tests for the Bonferroni corrected p value 105,106.
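The two multiple-testing corrections named above can be illustrated with a short Python sketch. It rests on several assumptions: the effective number of independent tests is simply passed in as a number (the study obtained it from simpleM, which is not reimplemented here), the function names are hypothetical, and the Benjamini-Hochberg adjustment is used as a simple stand-in for q-values rather than as the exact q-value procedure applied in the analysis.

```python
def bonferroni_threshold(alpha, n_effective_tests):
    """Per-test significance threshold: alpha divided by the effective number of tests."""
    if n_effective_tests <= 0:
        raise ValueError("n_effective_tests must be positive")
    return alpha / n_effective_tests


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p), then made
    monotone by taking a running minimum from the largest p-value downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under this sketch, a variant would be retained at an FDR of 0.2 when its adjusted value is at most 0.2, and declared Bonferroni-significant when its raw p value falls below `bonferroni_threshold(0.05, n_effective_tests)`.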

FIGURE 1. Multidimensional scaling analysis of a) ZEHRP cohort and samples from HapMap3 and b) a close examination of clustering for samples with African ancestry.

FIGURE 2. Distribution of a) seroconverter (SC) and b) seroprevalent (SP) viral loads before and after transformation.

TABLE 1. Overview of prior genome-wide association studies on HIV-1 outcomes.

Fellay, 2007
  HIV-1 outcome: Progression, VL set point and time to CD4 count <350
  Data source, total N: Euro-CHAVI Cohort, 486
  Ethnicity, country: Caucasian; Italy, Australia, Denmark, Spain, Switzerland, and United Kingdom
  Genotype array: Illumina HumanHap550 BeadChip
  Genes/SNPs identified (effect on phenotype): HCP5: rs ( VL); 5' HLA-C: rs ( VL); ZNRD1 and RNF39: rs , rs , rs , rs , rs , rs , rs ( CD4)

Dalmasso, 2008
  HIV-1 outcome: Progression, VL set point and HIV controllers
  Data source, total N: ANRS PRIMO Cohort, 605
  Ethnicity, country: Caucasian, France
  Genotype array: Illumina Sentrix Human Hap300 BeadChip
  Genes/SNPs identified (effect on phenotype): 5' HLA-C: rs ( VL); HLA-B: rs ( VL); HCP5: rs ( VL); TNXB: rs and rs ( VL); TNF: rs ( VL)

Limou, 2009
  HIV-1 outcome: Progression, LTNP
  Data source, total N: GRIV, 361
  Ethnicity, country: Caucasian, France
  Genotype array: Illumina HumanHap300 BeadChip
  Genes/SNPs identified (effect on phenotype): HCP5: rs ( LTNP); ZNRD1 and RNF39: rs and rs ( LTNP)

Le Clerc, 2009
  HIV-1 outcome: Progression, rapid progression (CD4)
  Data source, total N: GRIV, 360
  Ethnicity, country: Caucasian, France
  Genotype array: Illumina HumanHap300 BeadChip
  Genes/SNPs identified (effect on phenotype): PRMT6: rs and rs ( CD4); RXRG: rs ( CD4); SOX5: rs ( CD4); TGFBRAP1: rs ( CD4)

Fellay, 2009
  HIV-1 outcome: Progression, VL set point and time to CD4 count <350 or <500+ART
  Data source, total N: Euro-CHAVI Cohort and MACS, 2362
  Ethnicity, country: Caucasian; Italy, Australia, Denmark, Spain, Switzerland, United Kingdom, and United States
  Genotype array: Illumina's HumanHap550 BeadChip + special 8,000 HLA SNP chip and Human 1M BeadChip
  Genes/SNPs identified (effect on phenotype): HCP5: rs ( VL); 5' HLA-C: rs ( VL); C6orf12: rs259919; TRIM10: rs ; 3' HLA-B: rs ; NOTCH4: rs ; ZNRD1 and RNF39: rs , rs , rs , rs , rs , rs , rs ( CD4)

Herbeck, 2010
  HIV-1 outcome: Progression, time to clinical AIDS and VL set point
  Data source, total N: MACS, ALIVE, SFCC, MHCS, 746
  Ethnicity, country: Caucasian, United States
  Genotype array: Affymetrix GeneChip Human Mapping 500K Array Set
  Genes/SNPs identified (effect on phenotype): PROX1: rs ( time to AIDS); HLA-B: rs ( time to AIDS); CCR2/CCR5: rs916093 ( time to AIDS); HCP5: rs ( VL); AIF1: rs ( VL)

Pelak, 2010
  HIV-1 outcome: Progression, VL set point
  Data source, total N: DoD HIV NHS and MACS Cohorts, 515
  Ethnicity, country: African American, United States
  Genotype array: Illumina's HumanHap 1M, 1M-Duo DNA, or HumanHap 550K BeadChips
  Genes/SNPs identified (effect on phenotype): No genome-wide significant associations; HLA-B: rs ( VL)

TABLE 1 (continued).

Pereyra, 2010
  HIV-1 outcome: Progression, elite controllers
  Data source, total N: Int'l HIV Controllers and AIDS Clinical Trials Group, 3622
  Ethnicity, country: Caucasian, African American, and Hispanic; United States, Canada, Western Europe, Australia
  Genotype array: Illumina's HumanHap 650Y and Human 1M-Duo BeadChip
  Genes/SNPs identified (effect on phenotype): European, 5' HLA-C: rs ( VL), HCP5: rs ( VL), MICA: rs ( VL), PSORS1C3: rs ( VL); African American, HLA-B: rs ( VL), HCP5: rs ( VL), HLA-B: rs ( VL), HCG22: rs ( VL); Hispanic: HLA-B: rs ( VL)

Petrovski, 2011
  HIV-1 outcome: Susceptibility, HIV seroconversion
  Data source, total N: CHAVI Cohort, 1379
  Ethnicity, country: African, Malawi
  Genotype array: Illumina's HumanHap 1M and 1M-Duo DNA Analysis BeadChip
  Genes/SNPs identified (effect on phenotype): No genome-wide significant associations; IL18: rs ( seroconversion)

Lingappa, 2011
  HIV-1 outcome: Susceptibility and progression, HIV seroconversion and VL set point
  Data source, total N: Partners HSV/HIV Study and COS, 798
  Ethnicity, country: African; Rwanda, Botswana, Kenya, South Africa, Tanzania, Zambia, and Uganda
  Genotype array: Illumina HumanHap 1M-Duo BeadChip
  Genes/SNPs identified (effect on phenotype): No genome-wide significant associations; IRF1: rs ( seroconversion); MBL2: rs ( seroconversion)

Limou, 2012
  HIV-1 outcome: Susceptibility
  Data source, total N: GRIV, DESIR, ACS, 1837
  Ethnicity, country: Caucasian; France, Netherlands
  Genotype array: Illumina HumanHap300 BeadChip
  Genes/SNPs identified (effect on phenotype): CYP7B1: rs ( seroconversion)

LTNP = long term nonprogression, Euro-CHAVI = The European Center for HIV/AIDS Vaccine Immunology Consortium, GRIV = Genomics of Resistance to Immunodeficiency Virus, MACS = Multicenter AIDS Cohort Study, ALIVE = AIDS Link to the Intravenous Experience, SFCC = San Francisco City Clinic Cohort Study, MHCS = Multicenter Hemophilia Cohort Study, CHAVI = Center for HIV/AIDS Vaccine Immunology Clinical Core, COS = Couples Observational Study, DESIR = Data from an Epidemiological Study on Insulin Resistance syndrome, ACS = Amsterdam Cohort Study
Bold = statistically significant results

TABLE 2. Comparison of number of polymorphisms for selected genes on the Illumina 1M-Duo and ImmunoChip arrays. *
Chromosome Gene 1M-Duo ImmunoChip
1 IL IL IL IL ZNRD RNF C6orf TRIM HCG PSORS1C HLA-C HLA-B MICA HCP TNF AIF TNXB NOTCH IL10RA IL10RB 91 4
* Total number includes coding, complex, intergenic, intron, and untranslated regions for each gene. Genes are sorted by position on the given chromosome.

HIV-1 DYNAMICS: A REAPPRAISAL OF HOST AND VIRAL FACTORS, AS WELL AS METHODOLOGICAL ISSUES
by
HEATHER A. PRENTICE AND JIANMING TANG
Viruses, October 2012, p , Vol. 4, No. 10
Copyright 2012 by Viruses
Used by permission
Format adapted for dissertation

Abstract

The dynamics of HIV-1 viremia form a complex and evolving landscape with clinical and epidemiological (public health) implications. Most studies have relied on the use of set-point viral load (VL) as a readily available proxy of viral dynamics to assess host and viral correlates. This review highlights recent findings from population-based studies of set-point VL, focusing primarily on robust data related to host genetics. A comprehensive understanding of viral dynamics will clearly need to consider both host and viral characteristics, with close attention to (i) the timing of VL measurements, (ii) the biology of viral evolution, (iii) compartments of active viral replication, (iv) the transmission source partner as the immediate past microenvironment, and (v) proper application of statistical models.

1. Introduction

HIV-1 infection typically occurs through a single viral variant 1-5, but the initial viral homogeneity is rather transient, as the surviving viruses must mutate to evade host immune defenses or to regain fitness lost during adaptation to the immediate past host (the transmission source partner) 6. At the population level, HIV-1 subtypes responsible for the global AIDS pandemic can vary by geographic region 7,8, while frequent superinfection can generate mosaic viruses (circulating recombinant forms) to promote viral diversity 8. Understanding the evolution of HIV-1-host interactions requires close attention to both viral and host (immunologic) dynamics 9.

HIV-1 viral load (VL) set-point is a well-studied phenotype tied to virus-host equilibrium, with high set-point VLs translating to rapid disease progression and fast

transmission to susceptible hosts 18,19. In many individuals, the viral set-point is reached within weeks of infection 12,20,21, and it can remain relatively steady (±0.5 log10 RNA copies/ml) for years during clinical latency 10. Progression to AIDS is usually accompanied by (i) rising VL, (ii) substantial loss of CD4+ T-cells in peripheral blood, and (iii) risk for opportunistic infections and malignancies. AIDS diagnosis based on <200 CD4 cells/mm^3 of blood and at least one opportunistic infection 22,23 can serve as another important phenotype for measuring the dynamics of host-virus interactions, but it can take close to a decade to develop even during untreated HIV-1 infection. In the era of highly active antiretroviral therapy (HAART), AIDS diagnosis is increasingly rare, so a focus on studying set-point VL as a proxy of viral fitness under a specific microenvironment in the host is well justified, especially since many clinical decisions must be made during the early stages of HIV-1 infection 9,24.

HIV-1 VL was, in one way or another, a subject in over 2,600 articles published since January 2010 (Figure 1). Our review here intends to highlight recent population-based research on host and viral correlates of HIV-1 VL set-point or its equivalent. For clarity and fair comparisons, studies assessing the relationship between host and/or viral factors and early set-point VL were selected according to two phenotypes, i.e., set-point and chronic VLs as continuous or categorical endpoints. In addition, it was necessary to exclude studies dealing with children or youth (rare) or with small sample sizes (<100 HAART-naïve subjects). In the end, a total of 22 original research articles remained after four rounds of selection (Figure 1). Interpretation of these recent studies is relatively straightforward when supporting evidence from earlier reports is available.

2. Host Genetics and Set-Point VL

2.1 Human Leukocyte Antigen (HLA) Class I and Class II Genes as Prominent Factors

HLA molecules mediate immune responses through multiple mechanisms, and their importance to effective immune control of HIV-1 infection has been well publicized in the past two decades 9. Polymorphisms around the peptide-binding groove of HLA class I (HLA-I) and HLA-II molecules determine the specificity of cytotoxic T-lymphocyte responses (CTLs) and T-helper cell epitopes, respectively 28. Direct interactions between HLA-I and killer cell immunoglobulin-like receptors (KIRs) can dictate natural killer (NK) cell function 29, which is further regulated by HLA leader peptides loaded to HLA-E 30,31. These intertwined properties, essential to both innate and adaptive immunity, inevitably complicate the analysis of individual HLA alleles and certain functionally relevant residues or motifs shared by different alleles.

When individual HLA-I alleles are compared, new findings (Table 1) continue to support the notion that HLA-A and HLA-C alleles are less prominent than HLA-B alleles. Specifically, studies have readily recognized HLA-B*13, B*14, B*18, B*27, B*35, B*44, B*45, B*53, B*57, B*58:01, B*58:02 and B*81 as distinct correlates of HIV-1 VL in several cohorts from Africa and North America. Evidence for three HLA-A alleles (A*32, A*36, and A*74), two HLA-C alleles (C*08 and C*18), and one combination (HLA-A*30+HLA-C*03) is rather consistent with earlier observations, with HLA-A*74 being favorable (low VL) in native Africans and African-Americans 34,35. Linkage disequilibrium (LD) between HLA-A*74 and HLA-B*57 may obscure the analysis of the former, but an independent contribution by A*74 was evident in a large sample 39. HLA-C*18 as a favorable allele needs further assessment, as it apparently tags two

favorable HLA-B alleles, B*57:03 and B*81:01 34,40. The HLA-C*12-HLA-B*39 haplotype is another example of neighboring alleles that are hard to separate 33,35.

For HLA-II (Table 1), only two alleles have shown appreciable impact on set-point VL: HLA-DRB1*01:02 and HLA-DRB1*13:03 are associated with relatively high and low VL, respectively 34,41. Of note, HLA-DRB1*01:02 was associated with high VL in a combined cohort of seroconverting patients (SCs) and seroprevalent patients (SPs) or in SPs alone 34. In theory, SCs are more suitable for association analyses, as few viral mutations are seen in early infection when set-point VL is measured. The relatively late effect of HLA-DRB1*01:02 (if confirmed) may reflect the delayed onset of high-affinity antibody responses mediated by HLA-II products. On the other hand, HLA-DRB1*13:03 is in moderate LD with HLA-B*57, but its association with low VL remained clear even when patients with HLA-B*57 were excluded 41.

When the mature HLA-B protein forms are inferred from HLA-B genotyping results, three amino acid residues at positions 67, 70, and 97 (Met67, Ser70, and Val97 around the C and F pockets) seem to explain alleles (e.g., B*57) associated with favorable outcomes (HIV-1 control) 42. In African Americans, nonsynonymous single nucleotide polymorphisms (SNPs) corresponding to HLA-B amino acid positions 63, 97, and 116 account for much of the effects attributable to the HLA-B locus 43. However, HLA-B*44 alleles (Ser67, Asn70, and Arg97) that are favorable in native Africans did not conform to this newly recognized rule 36. Similarity or difference in peptide-binding preferences alone may not fully capture the spectrum of concerted and evolving immune function that is essential to durable containment of HIV-1 infection 44.

Specific alleles and codon positions aside, HLA-I homozygosity (lack of diversity) has what appears to be an additive effect on set-point VL 33,35 (Table 1), probably by allowing rapid viral immune escape. Homozygosity is mostly restricted to common HLA-I alleles found in a given population, so its disadvantage may alternatively imply the advantage of rare or infrequent alleles to which viral adaptation is less likely to occur 45. This concept of allele frequency-dependent influences on HIV-1 pathogenesis deserves further evaluation 35.

2.2 Killer Cell Immunoglobulin-like Receptor (KIR) Genes

KIR gene products are primarily expressed on NK cells to inhibit or activate cytotoxic activities, depending on the combination of receptor-ligand (HLA-B or HLA-C) pairing. Just like their HLA ligands, KIR molecules are highly polymorphic in terms of gene content and allelic diversity. In the presence of HLA-B ligand Bw4-80I, the activating KIR3DS1 and inhibitory KIR3DL1 may delay HIV-1 disease progression (time to AIDS or death) 46,49. The specific role of KIR-HLA interaction in the early course of HIV-1 infection is not obvious 50.

New evidence now suggests that KIR3DS1 copy number variation is worth noting (Table 1). When HLA-Bw4-80I is present, one or more copies of KIR3DS1 were associated with relatively low set-point VL, even after statistical adjustments for other known factors in the KIR-HLA interaction pathway, including HLA-B*57, B*27, and B*35Px 51. Two other recent studies found no association between KIR3DL1, KIR3DS1, or KIR2DS4 and VL 47,52. Differences in methodology and KIR3DS1 population frequencies may account for the lack of immediate consensus.

2.3 Chemokine Receptors and Ligand Genes

Several chemokine receptors, especially CCR5 and CCR2, act as HIV-1 coreceptors that facilitate viral entry into target cells. Neighboring CCR2 and CCR5 gene variants (haplotypes and diplotypes) have well-known relationships to HIV-1 transmission (initiation of infection) 53, but the evidence for their role in established infection is not persuasive 25,54. Heterozygosity for the 32-base-pair deletion in the CCR5 gene open reading frame is of epidemiological importance to various populations 54-57, as is the amino acid substitution of valine to isoleucine at CCR2 residue 64 (64V/I). The CCR2-CCR5 haplotypes tagged by CCR5-Δ32 (HHG*2) and CCR2-64I (HHF*2) may act in concert to influence set-point VL in populations of European ancestry 54, but that combination (HHF*2/HHG*2) is too rare in other racial groups to be a worthy topic. Further work on various genes encoding CCR5 ligands (MIP-1α, MIP-1β, and CCL5/RANTES) often leads to inconsistent or conflicting observations 58.

Investigation of chemokine receptor and ligand genes is still active (Table 1). Translation of CCR5-Δ32 to low set-point VL has gained new supporting evidence 59. Modest advantage was also seen with HHF*2 homozygosity 60. The HHD/HHE diplotype commonly seen in cohorts of African ancestry appeared to be unfavorable 60. More recently, the minor allele C for SNP rs (in the CCL3 gene) has been associated with low set-point VL 61, with a low probability of false discovery from multiple testing.

2.4 Other Miscellaneous Observations Based on Candidate Gene Approach

One study has revealed that DC-SIGNR (CD209L) genotypes can be associated with HIV-1 VL: the alleles encoding 7-repeat and 9-repeat isoforms appear to be unfavorable 62 (Table 1). The number of 23-amino acid repeats in the DC-SIGNR protein

ranges from three to nine 63, and the reported associations can be attributed to two isoform combinations, 5/7 and 7/9. Biologically, DC-SIGNR and DC-SIGN are transmembrane receptors on dendritic cells that help ferry HIV-1 virions to tissues enriched with CD4+ T-cells 64. Earlier work has provided some evidence for a possible distinction between the seven- or nine-repeat isoforms and others.

2.5 Results from Genome-wide Association Studies (GWAS)

GWAS provide a hypothesis-free approach to identifying genes or SNPs of epidemiological importance. Multiple GWAS have consistently pointed to two SNPs as markers of effective immune control during HIV-1 infection. The first, rs , is mapped to the HCP5 pseudogene. The second, rs , is located about 35 kb upstream of HLA-C. In Caucasians, these SNPs effectively tag HLA-B*57:01 and a microRNA target site polymorphism in the HLA-C 3' untranslated region (UTR), respectively 65,73. Other HLA-I alleles can be involved as well 67,68,74,75.

Variants defined by rs and rs are highlighted in two new studies 59,76 (Table 1). Separate analysis of SCs and SPs is considered useful, as the effect sizes for many individual SNPs can vary greatly between SCs and SPs 76. Two other GWAS based on African-Americans and native Africans failed to identify any SNPs with genome-wide statistical significance 77,78. In the African-American cohort, the top 10 SNPs of interest are all within the human major histocompatibility complex (MHC) 77. The SNP (rs ) with the best association signal (Table 1) is actually in LD with HLA-B*57:03 (a favorable allele). In the analysis of native Africans 78, the number one SNP of interest (rs ) is beyond the MHC region (Table 1).

3. Viral Genetics and HIV-1 Set-Point VL

3.1 HIV-1 Genotype

Epidemiologists and virologists are acutely aware of the evidence that defective viruses might partially explain spontaneous HIV-1 control, as seen in the strings of patients infected by a single blood donor in Sydney, Australia 79,80. The ability of such viruses to cause sexual transmission (an inefficient process) is unclear, but recent analyses of 134 native Africans with sexually transmitted primary HIV-1 infection 36 did reveal that acute-phase VL can be low (<2,000 copies/ml) in a small proportion (~6.7%) of SCs. Direct experimental evidence is still elusive, as infectious viruses are hard to recover from these subjects. Conversely, however, SCs with set-point VL below 50 copies/ml can have measurable acute-phase VL (>10,000 copies/ml) 36. Other investigators have also come across rare cases where elite control was possible even when highly pathogenic viruses from clinical AIDS patients were transmitted 81.

Following and verifying HIV-1 transmission chains are not easily done, but the assessment of donor and recipient VL can be useful 82. New results from analyses of linked transmission pairs (Table 2) support a modest linear relationship between donor VL (chronic) and recipient set-point VL 83. In a second study, genetic distances between viral sequences correlate with differences in VL 84, suggesting that viral genotypes should be considered during the search for quantitative trait loci.

3.2 Interaction of Host and Viral Genetic Factors

To properly dissect out factors (host or viral) with the greatest influence on HIV-1 evolution and VL, models will need to simultaneously consider host and viral dynamics 82,83,85. Among three closely related HLA-B allelic products examined in this context 86,

HLA-B*57:03 appears to target four p24 Gag epitopes (ISW9, KF11, TW10, and QW9), but HLA-B*57:02 and HLA-B*58:01 only target three and two of them, respectively 86. Conceivably, these allelic forms can impose differential selection pressure on the viral genome. In the end, the causal factors of viral fitness can lie in the host and in the transmitted virus.

4. Methodological Challenges

4.1 Variations in Calculation of Set-Point VL

Despite its wide use, there is still no standard method for determining HIV-1 set-point VL, and multiple methods have been used with little consistency 87. When a single RNA measurement is treated as the set-point 16,67,88, the timing of such measurement can vary greatly: (i) the visit after the first seropositive visit, (ii) a visit at least three months after the estimated date of infection (EDI), or (iii) a visit at least six months after the EDI. Others prefer to use data from several visits 12,89, in favor of methods that calculate the VL phenotype as the average or as the median of multiple VL points within specific intervals of infection 87. Those with more advanced statistical skills simply test repeated VL measurements in mixed models 36,74, but asymmetry in data structure (total visits and visit intervals) can be an issue. The decision to exclude patients with insufficient data can be a sticky business.

4.2 Early Chronic Phase Versus Chronic Phase

Viral adaptation to the host microenvironment, including protective immune responses, is a gradual process. VLs taken during the early and later course of infection can possess similar traits for very different reasons 24,34,76, so findings are not directly

comparable when the duration of infection is unknown. As most studies have already missed the early course of infection 57, the literature is likely most relevant to chronic infection, when opportunistic infections (OIs) may complicate the analysis. The OIs can be disparate in exposure, tissue compartments, and geography, but they are rarely captured in analysis of HIV-1 VL readouts.

4.3 Changes in Set-Point VL over Calendar Time

Several studies have noted an increase in set-point VL over time 59,90-92, while others disagree. A large meta-analysis pooling results from prior studies of seroincident patients found a trend for a rising VL set-point over time 96. Assuming that widespread viral adaptation does occur 59,96,97, one can envision that the timing of the AIDS epidemic in different populations can be critical. In a European population, pre-2003 set-point VL appeared to differ from post-2003 VL in SCs 91, accompanied by a loss of host genetic advantage conferred by CCR5-Δ32 and other prominent factors (e.g., rs /B*57:01) 59. Likewise, patients with HLA-B*51 before and after 2001 differed in their VLs 98, which is consistent with the hypothesis that specific CTL escape mutations induced by HLA-mediated immunity can reach fixation when these mutations have no apparent fitness costs 99. Finding the tipping point for adapted versus unadapted viruses in each population is obviously another sticky business.

4.4 Other Potential Confounders

Cofactors not routinely considered in analysis of HIV-1 VL can be quite obvious. For example, age and gender are known to influence VL 37, but they are infrequently seen in reported statistical models. Other less obvious factors can range from viral subtype and its segregation with certain racial backgrounds 100,101 to differential distribution of

important genetic variations (e.g., CCR5-Δ32 and HLA-B*27) or the techniques used for defining them. Future studies will clearly need to apply multivariable models to carefully consider covariates and potential confounders. Composite scores based on all known factors may offer a temporary solution to simplifying the data analysis process 13,102, although individual factors may not have equally additive effects on HIV-1 VL.

5. Conclusions

HIV-1 viremia is an informative quantitative trait that varies at the individual and population levels. While many studies have attempted to sort out the quantitative trait loci, lack of clear consensus hints at various problems with study design and data analysis. Factors important to VL can lie in the host and viral genomes. As viral evolution shaped by host immune responses becomes more and more predictable, fine-mapping of viral and host genetics can begin to allow a fair assessment of primary and secondary factors for transformative research. In other words, an open-minded research question is not whether host factors predominate over viral factors or vice versa; the two are so intertwined that their constant interactions in distinct individuals and populations collectively dictate the landscape of viral dynamics. The ultimate challenge (and goal) is to properly integrate comprehensive data on host and viral characteristics. The need for such an approach is urgent, as datasets generated by high-throughput techniques will become overwhelmingly complex.

Acknowledgements

This review was done as part of ongoing research supported by two grants, R01- AI and R01-AI064060, from the National Institute of Allergy and Infectious Diseases.

References

1. Gottlieb GS, Heath L, Nickle DC, et al. HIV-1 variation before seroconversion in men who have sex with men: analysis of acute/early HIV infection in the multicenter AIDS cohort study. J Infect Dis. 2008;197(7):
2. Keele BF, Giorgi EE, Salazar-Gonzalez JF, et al. Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection. Proc Natl Acad Sci U S A. 2008;105(21):
3. Abrahams MR, Anderson JA, Giorgi EE, et al. Quantitating the multiplicity of infection with human immunodeficiency virus type 1 subtype C reveals a non-Poisson distribution of transmitted variants. J Virol. 2009;83:
4. Haaland RE, Hawkins PA, Salazar-Gonzalez J, et al. Inflammatory genital infections mitigate a severe genetic bottleneck in heterosexual transmission of subtype A and C HIV-1. PLoS Pathog. 2009;5(1):e
5. Kearney M, Maldarelli F, Shao W, et al. Human immunodeficiency virus type 1 population genetics and adaptation in newly infected individuals. J Virol. 2009;83(6):
6. Walker BD, Korber BT. Immune control of HIV: the obstacles of HLA and viral diversity. Nat Immunol. 2001;2(6):
7. Korber B, Muldoon M, Theiler J, et al. Timing the ancestor of the HIV-1 pandemic strains. Science. Jun ;288(5472):
8. Tebit DM, Arts EJ. Tracking a century of global expansion and evolution of HIV to drive understanding and to combat disease. Lancet Infect Dis. Jan 2011;11(1):45-56.

9. Boutwell CL, Rolland MM, Herbeck JT, Mullins JI, Allen TM. Viral evolution and escape during acute HIV-1 infection. J Infect Dis. 2010;202(Suppl 2):S
10. Mellors JW, Rinaldo CR Jr, Gupta P, White RM, Todd JA, Kingsley LA. Prognosis in HIV-1 infection predicted by the quantity of virus in plasma. Science. 1996;272(5265):
11. Mellors JW, Muñoz A, Giorgi JV, et al. Plasma viral load and CD4+ lymphocytes as prognostic markers of HIV-1 infection. Ann Intern Med. 1997;126(12):
12. de Wolf F, Spijkerman I, Schellekens PT, et al. AIDS prognosis based on HIV-1 RNA, CD4+ T-cell count and function: markers with reciprocal predictive value over time after seroconversion. AIDS. 1997;11(15):
13. Mann DL, Garner RP, Dayoff DE, et al. Major histocompatibility complex genotype is associated with disease progression and virus load levels in a cohort of human immunodeficiency virus type 1-infected Caucasians and African Americans. J Infect Dis. 1998;178(6):
14. Lyles RH, Muñoz A, Yamashita TE, et al. Natural history of human immunodeficiency virus type 1 viremia after seroconversion and proximal to AIDS in a large cohort of homosexual men. Multicenter AIDS Cohort Study. J Infect Dis. 2000;181(3):
15. Gottlieb GS, Sow PS, Hawes SE, et al. Equal plasma viral loads predict a similar rate of CD4+ T cell decline in human immunodeficiency virus (HIV) type 1- and HIV-2-infected individuals from Senegal, West Africa. J Infect Dis. 2002;185(7):
16. Mellors JW, Margolick JB, Phair JP, et al. Prognostic value of HIV-1 RNA, CD4 cell count, and CD4 cell count slope for progression to AIDS and death in untreated HIV-1 infection. JAMA. 2007;297(21):
17. Lavreys L, Baeten JM, Chohan V, et al. Higher set point plasma viral load and more-severe acute HIV type 1 (HIV-1) illness predict mortality among high-risk HIV-1-infected African women. Clin Infect Dis. 2006;42(9):

18. Quinn TC, Wawer MJ, Sewankambo N, et al. Viral load and heterosexual transmission of human immunodeficiency virus type 1. Rakai Project Study Group. N Engl J Med. 2000;342(13):
19. Fideli US, Allen SA, Musonda R, et al. Virologic and immunologic determinants of heterosexual transmission of human immunodeficiency virus type 1 in Africa. AIDS Res Hum Retroviruses. 2001;17(10):
20. Daar ES, Moudgil T, Meyer RD, Ho DD. Transient high levels of viremia in patients with primary HIV-1 infection. N Engl J Med. 1991;324(14):
21. Geskus RB, Prins M, Hubert JB, et al. The HIV RNA setpoint theory revisited. Retrovirology. 2007;4:
22. National Institute of Allergy and Infectious Diseases. What are HIV and AIDS? HIV/AIDS 2008; DS.aspx. Accessed November 27,
23. Egger M, May M, Chêne G, et al. Prognosis of HIV-1-infected patients starting highly active antiretroviral therapy: a collaborative analysis of prospective studies. Lancet. 2002;360(9327):
24. Langford SE, Ananworanich J, Cooper DA. Predictors of disease progression in HIV infection: a review. AIDS Res Ther. 2007;4:
25. Tang J, Kaslow RA. The impact of host genetics on HIV infection and disease progression in the era of highly active antiretroviral therapy. AIDS. 2003;17(Suppl 4):S51-S
26. Carrington M, O'Brien SJ. The influence of HLA genotype on AIDS. Annu Rev Med. 2003;54:
27. Streeck H, Jolin JS, Qi Y, et al. Human immunodeficiency virus type 1-specific CD8+ T-cell responses during primary infection are major determinants of the viral set point and loss of CD4+ T cells. J Virol. 2009;83(15):
28. Horton R, Wilming L, Rand V, et al. Gene map of the extended human MHC. Nat Rev Genet. Dec 2004;5(12):

29. Carrington M, Martin MP, van Bergen J. KIR-HLA intercourse in HIV disease. Trends Microbiol. Dec 2008;16(12):
30. O'Callaghan CA, Bell JI. Structure and function of the human MHC class Ib molecules HLA-E, HLA-F and HLA-G. Immunol Rev. Jun 1998;163:
31. Yunis EJ, Romero V, Diaz-Giffero F, Zuniga J, Koka P. Natural killer cell receptor NKG2A/HLA-E interaction dependent differential thymopoiesis of hematopoietic progenitor cells influences the outcome of HIV infection. J Stem Cells. 2007;2(4):
32. Gao X, O'Brien TR, Welzel TM, et al. HLA-B alleles associate consistently with HIV heterosexual transmission, viral load, and progression to AIDS, but not to susceptibility to infection. AIDS. 2010;24(12):
33. Leslie A, Matthews PC, Listgarten J, et al. Additive contribution of HLA class I alleles in the immune control of HIV-1 infection. J Virol. 2010;84(19):
34. Tang J, Malhotra R, Song W, et al. Human leukocyte antigens and HIV type 1 viral load in early and chronic infection: predominance of evolving relationships. PLoS One. 2010;5(3):e
35. Lazaryan A, Song W, Lobashevsky E, et al. The influence of human leukocyte antigen class I alleles and their population frequencies on human immunodeficiency virus type 1 control among African Americans. Hum Immunol. 2011;72(4):
36. Tang J, Cormier E, Gilmour J, et al. Human leukocyte antigen variants B*44 and B*57 are consistently favorable during two distinct phases of primary HIV-1 infection in sub-Saharan Africans with several viral subtypes. J Virol. 2011;85(17):
37. Tang J, Tang S, Lobashevsky E, et al. Favorable and unfavorable HLA class I alleles and haplotypes in Zambians predominantly infected with clade C human immunodeficiency virus type 1. J Virol. 2002;76(16):
38. Trachtenberg E, Korber B, Sollars C, et al. Advantage of rare HLA supertype in HIV disease progression. Nat Med. 2003;9(7):

39. Matthews PC, Adland E, Listgarten J, et al. HLA-A*7401-mediated control of HIV viremia is independent of its linkage disequilibrium with HLA-B*5703. J Immunol. 2011;186(10):
40. Tang J, Shao W, Yoo YJ, et al. Human leukocyte antigen class I genotypes in relation to heterosexual HIV type 1 transmission within discordant couples. J Immunol. 2008;181(4):
41. Julg B, Moodley ES, Qi Y, et al. Possession of HLA class II DRB1*1303 associates with reduced viral loads in chronic HIV-1 clade C and B infection. J Infect Dis. 2011;203(6):
42. Pereyra F, Jia X, McLaren PJ, et al. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science. 2010;330(6010):
43. McLaren PJ, Ripke S, Pelak K, et al. Fine-mapping classical HLA variation associated with durable host control of HIV-1 infection in African Americans. Hum Mol Genet. 2012;21(19):
44. Koup RA, Graham BS, Douek DC. The quest for a T cell-based immune correlate of protection against HIV: a story of trials and errors. Nat Rev Immunol. Jan 2011;11(1):
45. Rousseau CM, Daniels MG, Carlson JM, et al. HLA class I-driven evolution of human immunodeficiency virus type 1 subtype C proteome: immune escape and viral load. J Virol. 2008;82(13):
46. Qi Y, Martin MP, Gao X, et al. KIR/HLA pleiotropism: protection against both HIV and opportunistic infections. PLoS Pathog. 2006;2(8):e
47. Merino A, Malhotra R, Morton M, et al. Impact of a functional KIR2DS4 allele on heterosexual HIV-1 transmission among discordant Zambian couples. J Infect Dis. 2011;203(4):
48. Vilches C, Parham P. KIR: diverse, rapidly evolving receptors of innate and adaptive immunity. Annu Rev Immunol. 2002;20:

49. Martin MP, Qi Y, Gao X, et al. Innate partnership of HLA-B and KIR3DL1 subtypes against HIV-1. Nat Genet. 2007;39(6):
50. Gaudieri S, DeSantis D, McKinnon E, et al. Killer immunoglobulin-like receptors and HLA act both independently and synergistically to modify HIV disease progression. Genes Immun. 2005;6(8):
51. Pelak K, Need AC, Fellay J, et al. Copy number variation of KIR genes influences HIV-1 control. PLoS Biol. 2011;9(11):e
52. Silva EM, Acosta AX, Santos EJ, et al. HLA-Bw4-B*57 and Cw*18 alleles are associated with plasma viral load modulation in HIV-1 infected individuals in Salvador, Brazil. Braz J Infect Dis. 2010;14(5):
53. Kaslow RA, Dorak T, Tang JJ. Influence of host genetic variation on susceptibility to HIV type 1 infection. J Infect Dis. 2005;191(Suppl 1):S
54. Tang J, Shelton B, Makhatadze NJ, et al. Distribution of chemokine receptor CCR2 and CCR5 genotypes and their relative contribution to human immunodeficiency virus type 1 (HIV-1) seroconversion, early HIV-1 RNA concentration in plasma, and later disease progression. J Virol. 2002;76(2):
55. Katzenstein TL, Eugen-Olsen J, Hofmann B, et al. HIV-infected individuals with the CCR delta32/CCR5 genotype have lower HIV RNA levels and higher CD4 cell counts in the early years of the infection than do patients with the wild type. Copenhagen AIDS Cohort Study Group. J Acquir Immune Defic Syndr Hum Retrovirol. 1997;16(1):
56. Meyer L, Magierowska M, Hubert JB, et al. Early protective effect of CCR-5 delta 32 heterozygosity on HIV-1 disease progression: relationship with viral load. The SEROCO Study Group. AIDS. 1997;11(11):F
57. Ioannidis JP, Rosenberg PS, Goedert JJ, et al. Effects of CCR5-Delta32, CCR2-64I, and SDF-1 3'A alleles on HIV-1 disease progression: An international meta-analysis of individual-patient data. Ann Intern Med. 2001;135(9):
58. Shrestha S, Tang J, Kaslow RA. Gene copy number: learning to count past two. Nat Med. Oct 2009;15(10):

59. van Manen D, Gras L, Boeser-Nunnink BD, et al. Rising HIV-1 viral load set point at a population level coincides with a fading impact of host genetic factors on HIV-1 control. AIDS. 2011;25(18):
60. Malhotra R, Hu L, Song W, et al. Association of chemokine receptor gene (CCR2-CCR5) haplotypes with acquisition and control of HIV-1 infection in Zambians. Retrovirology. 2011;8:
61. Hu L, Song W, Brill I, et al. Genetic variations and heterosexual HIV-1 infection: analysis of clustered genes encoding CC-motif chemokine ligands. Genes Immun. 2012;13(2):
62. Xu L, Li Q, Ye H, et al. The nine-repeat DC-SIGNR isoform is associated with increased HIV-RNA loads and HIV sexual transmission. J Clin Immunol. May 2010;30(3):
63. Liu H, Carrington M, Wang C, et al. Repeat-region polymorphisms in the gene for the dendritic cell-specific intercellular adhesion molecule-3-grabbing nonintegrin-related molecule: effects on HIV-1 susceptibility. J Infect Dis. Mar ;193(5):
64. Geijtenbeek TB, van Duijnhoven GC, van Vliet SJ, et al. Identification of different binding sites in the dendritic cell-specific receptor DC-SIGN for intercellular adhesion molecule 3 and HIV-1. J Biol Chem. 2002;277(13):
65. Fellay J, Shianna KV, Ge D, et al. A whole-genome association study of major determinants for host control of HIV-1. Science. 2007;317(5840):
66. Dalmasso C, Carpentier W, Meyer L, et al. Distinct genetic loci control plasma HIV-RNA and cellular HIV-DNA levels in HIV-1 infection: the ANRS Genome Wide Association 01 study. PLoS One. 2008;3(12):e
67. Catano G, Kulkarni H, He W, et al. HIV-1 disease-influencing effects associated with ZNRD1, HCP5 and HLA-C alleles are attributable mainly to either HLA-A10 or HLA-B*57 alleles. PLoS One. 2008;3(11):e
68. Fellay J, Ge D, Shianna KV, et al. Common genetic variation and the control of HIV-1 in humans. PLoS Genet. 2009;5(12):e

69. Thomas R, Apps R, Qi Y, et al. HLA-C cell surface expression and control of HIV/AIDS correlate with a variant upstream of HLA-C. Nat Genet. 2009;41(12):
70. van Manen D, Kootstra NA, Boeser-Nunnink B, Handulle MAM, van't Wout AB, Schuitemaker H. Association of HLA-C and HCP5 gene regions with the clinical course of HIV-1 infection. AIDS. 2009;23(1):
71. Guergnon J, Theodorou I. What did we learn on host's genetics by studying large cohorts of HIV-1-infected patients in the genome-wide association era? Curr Opin HIV AIDS. 2011;6(4):
72. Aouizerat BE, Pearce CL, Miaskowski C. The search for host genetic factors of HIV/AIDS pathogenesis in the post-genome era: progress to date and new avenues for discovery. Curr HIV/AIDS Rep. 2011;8(1):
73. Kulkarni S, Savan R, Qi Y, et al. Differential microRNA regulation of HLA-C expression and its association with HIV control. Nature. 2011;472(7344):
74. Shrestha S, Aissani B, Song W, Wilson CM, Kaslow RA, Tang J. Host genetics and HIV-1 viral load set-point in African-Americans. AIDS. 2009;23(6):
75. Trachtenberg E, Bhattacharya T, Ladner M, Phair J, Erlich H, Wolinsky S. The HLA-B/C haplotype block contains major determinants for host control of HIV. Genes Immun. 2009;10(8):
76. Evangelou E, Fellay J, Colombo S, et al. Impact of phenotype definition on genome-wide association signals: empirical evaluation in human immunodeficiency virus type 1 infection. Am J Epidemiol. 2011;173(11):
77. Pelak K, Goldstein DB, Walley NM, et al. Host determinants of HIV-1 control in African Americans. J Infect Dis. 2010;201(8):
78. Lingappa JR, Petrovski S, Kahle E, et al. Genomewide association study for determinants of HIV-1 acquisition and viral set point in HIV-1 serodiscordant couples with quantified virus exposure. PLoS One. 2011;6(12):e28632.

79. Learmont JC, Geczy AF, Mills J, et al. Immunologic and virologic status after 14 to 18 years of infection with an attenuated strain of HIV-1. A report from the Sydney Blood Bank Cohort. N Engl J Med. Jun ;340(22):
80. Zaunders J, Dyer WB, Churchill M. The Sydney Blood Bank Cohort: implications for viral fitness as a cause of elite control. Curr Opin HIV AIDS. May 2011;6(3):
81. Bailey JR, O'Connell K, Yang HC, et al. Transmission of human immunodeficiency virus type 1 from a patient who developed AIDS to an elite suppressor. J Virol. Aug 2008;82(15):
82. Tang J, Tang S, Lobashevsky E, et al. HLA allele sharing and HIV type 1 viremia in seroconverting Zambians with known transmitting partners. AIDS Res Hum Retroviruses. 2004;20(1):
83. Hollingsworth TD, Laeyendecker O, Shirreff G, et al. HIV-1 transmitting couples have similar viral load set-points in Rakai, Uganda. PLoS Pathog. 2010;6(5):e
84. Alizon S, von Wyl V, Stadler T, et al. Phylogenetic approach reveals that virus genotype largely determines HIV set-point viral load. PLoS Pathog. 2010;6(9):e
85. Novitsky V, Gilbert P, Peter T, et al. Association between virus-specific T-cell responses and plasma viral load in human immunodeficiency virus type 1 subtype C infection. J Virol. 2003;77(2):
86. Kloverpris HN, Stryhn A, Harndahl M, et al. HLA-B*57 micropolymorphism shapes HLA allele-specific epitope immunogenicity, selection pressure, and HIV immune control. J Virol. 2012;86(2):
87. Mei Y, Wang L, Holte SE. A comparison of methods for determining HIV viral set point. Stat Med. 2008;27(1):
88. Mellors JW, Kingsley LA, Rinaldo CR Jr, et al. Quantitation of HIV-1 RNA in plasma predicts outcome after seroconversion. Ann Intern Med. 1995;122(8):

89. Hubert JB, Burgard M, Dussaix E, et al. Natural history of serum HIV-1 RNA levels in 330 patients with a known date of infection. The SEROCO Study Group. AIDS. 2000;14(2):
90. Dorrucci M, Rezza G, Porter K, Phillips A, Concerted Action on Seroconversion to AIDS and Death in Europe Collaboration. Temporal trends in postseroconversion CD4 cell count and HIV load: the Concerted Action on Seroconversion to AIDS and Death in Europe Collaboration, J Infect Dis. 2007;195(4):
91. Gras L, Jurriaans S, Bakker M, et al. Viral load levels measured at set-point have risen over the last decade of the HIV epidemic in the Netherlands. PLoS One. 2009;4(10):e
92. Müller V, Maggiolo F, Suter F, et al. Increasing clinical virulence in two decades of the Italian HIV epidemic. PLoS One. 2009;5(5):e
93. Müller V, Ledergerber B, Perrin L, et al. Stable virulence levels in the HIV epidemic of Switzerland over two decades. AIDS. 2006;20(6):
94. Potard V, Weiss L, Lamontagne F, et al. Trends in post-infection CD4 cell counts and plasma HIV-1 RNA levels in HIV-1-infected patients in France between 1997 and J Acquir Immune Defic Syndr. 2009;52(3):
95. Troude P, Chaix ML, Tran L, et al. No evidence of a change in HIV-1 virulence since 1996 in France. AIDS. 2009;23(10):
96. Herbeck JT, Müller V, Maust BS, et al. Is the virulence of HIV changing? A meta-analysis of trends in prognostic markers of HIV disease progression and transmission. AIDS. 2012;26(2):
97. Moore CB, John M, James IR, Christiansen FT, Witt CS, Mallal S. Evidence of HIV-1 adaptation to HLA-restricted immune responses at a population level. Science. 2002;296(5572):
98. Koga M, Kawana-Tachikawa A, Heckerman D, et al. Changes in impact of HLA class I allele expression on HIV-1 plasma virus loads at a population level over time. Microbiol Immunol. 2010;54(4):

99. Kawashima Y, Pfafferott K, Frater J, et al. Adaptation of HIV-1 to human leukocyte antigen class I. Nature. 2009;458(7238):
100. Peeters M. The genetic variability of HIV-1 and its implications. Transfus Clin Biol. 2001;8(3):
101. Eberle J, Gurtler L. HIV types, groups, subtypes and recombinant forms: errors in replication, selection pressure and quasispecies. Intervirology. 2012;55(2):
102. Saah AJ, Hoover DR, Weng S, et al. Association of HLA profiles with early plasma viral load, CD4+ cell count and rate of progression to AIDS following acute HIV-1 infection. Multicenter AIDS Cohort Study. AIDS. 1998;12(16):

FIGURE 1. Selection of recent (post-2010) publications for systematic review. Two rounds of searches in PubMed yield 2,660 original research articles that contain three key words (HIV, viral load, and host or viral genome). Only 22 of these meet the criteria for full evaluation here (20 in Table 1 and two in Table 2).

Articles printed in English journals since January 2010 (N = 2,660)
Set point VL as the phenotype of interest (n = 50)
Adult populations only (n = 47 remaining)
Sample size greater than 100 (n = 25 excluded)
Treatment-naïve patients without AIDS-related co-infections (n = 22, final selection)
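The counts in Figure 1 can be checked with a short tally. The sketch below is illustrative only: it assumes the boxes form a single sequential funnel in the order listed, and the names `INITIAL_ARTICLES`, `REPORTED_STEPS`, and `tally_funnel` are hypothetical rather than part of the original review.

```python
# Counts are taken from Figure 1; reading them as one sequential
# funnel, in the order listed, is an assumption of this sketch.
INITIAL_ARTICLES = 2660

# (criterion, number of articles remaining after applying it)
REPORTED_STEPS = [
    ("Set-point VL as the phenotype of interest", 50),
    ("Adult populations only", 47),
    ("Sample size greater than 100", 47 - 25),  # figure reports 25 excluded
    ("Treatment-naive, no AIDS-related co-infections", 22),
]


def tally_funnel(initial, steps):
    """Return (criterion, excluded, remaining) for each sequential filter."""
    rows = []
    previous = initial
    for criterion, remaining in steps:
        if remaining > previous:
            raise ValueError(f"{criterion!r} cannot add articles back")
        rows.append((criterion, previous - remaining, remaining))
        previous = remaining
    return rows


if __name__ == "__main__":
    for criterion, excluded, remaining in tally_funnel(INITIAL_ARTICLES, REPORTED_STEPS):
        print(f"{criterion}: excluded {excluded}, remaining {remaining}")
```

Under this reading, the final criterion removes no further articles, and the 22 retained studies match the 20 summarized in Table 1 plus the two in Table 2.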

TABLE 1. Host genetic factors that are positively or negatively associated with HIV-1 viral load (VL) set-point or assumed set-point, as reported in recent studies (a).

Gene or gene cluster (b) | Allele or haplotype (c) | Ethnicity (d) | Impact on VL | Refs
Classical HLA class I genes: HLA-A, HLA-B, and HLA-C | A*32 | AA | Favorable |
 | A*36 | African | Unfavorable |
 | A*74 | AA, African | Favorable |
 | B*13 | African | Favorable |
 | B*14 | AA | Favorable |
 | B*18 | African | Unfavorable |
 | B*27 | Caucasian | Favorable |
 | B*35 | Caucasian | Unfavorable |
 | B*44 | African | Favorable |
 | B*45 | AA, African | Unfavorable |
 | B*53 | AA | Unfavorable |
 | B*57 | AA, African, Caucasian | Favorable |
 | B*58:01 | African | Favorable |
 | B*58:02 | African | Unfavorable |
 | B*81 | African | Favorable |
 | C*08 | African | Favorable |
 | C*18 | African | Favorable |
 | A*30+C*03 | African | Favorable |
 | C*04:01-B*81:01 | African | Favorable |
 | C*12-B*39 | African | Favorable |
 | Homozygosity | AA and African | Unfavorable |
HLA-DRB1 | DRB1*01:02 | African | Unfavorable |
 | DRB1*13:03 | African | Favorable |
Killer cell immunoglobulin-like receptor (KIR) genes | KIR3DS1 copy no. | Caucasian | Favorable if 1 copy |
 | KIR3DL1 copy no. | Caucasian | Favorable if 1 copy |
CCR5 | Δ32 heterozygosity | Caucasian | Favorable |
CCR2-CCR5 | HHD/HHE | African | Unfavorable |
 | HHF*2 homozygosity | African | Favorable |
CCL3 | rs , allele C | African | Favorable |
DC-SIGNR (CD209L) | 7 or 9 repeats of a 69-bp coding sequence | Asian (Chinese) | Unfavorable |
Miscellaneous loci (sporadic SNPs) | rs , allele C | Caucasian | Favorable |
 | rs , allele G | Caucasian | Favorable | 59,76

(a) Four studies 47,51,52,78 with mostly negative results (not reaching statistical significance) are cited briefly in the text.
(b) Organized by group and sorted by degree of popularity, i.e., the number of studies meeting criteria (see Figure 1).
(c) Variants in bold have shown consistency between studies conducted by different investigators. Certain amino acid residues may account for HLA-B allelic effects (e.g., B*57 and B*81) 42,43, as discussed in the text.
(d) AA = African American.

TABLE 2. Viral markers that are associated with HIV-1 set-point viral load (VL), as reported in recent studies.

Viral factor | Measurement | Impact on set-point VL | Refs
Heritability | Transmission source partner (TSP) VL | TSP VL correlates with set-point VL in linked recipients | 83
 | Genetic distance on phylogenetic tree | High heritability in set-point VL, from one infection to the next | 84

GENETIC CORRELATES WITHIN THE HUMAN EXTENDED MAJOR HISTOCOMPATIBILITY COMPLEX AND HIV-1 ACQUISITION

by

HEATHER A. PRENTICE, NICHOLAS M. PAJEWSKI, KUI ZHANG, ELIZABETH E. BROWN, RICHARD A. KASLOW, AND JIANMING TANG

In preparation for AIDS

Format adapted for dissertation

ABSTRACT

Although it is widely accepted that host genetic factors influence HIV-1 acquisition, only variants within the CCR5 gene have been confirmed in multiple investigations. Two genome-wide association studies within African cohorts failed to identify additional regions conferring resistance to infection; one meta-analysis has suggested CYP7B1 as a candidate gene for populations of European ancestry. In a search for polymorphisms associated with predisposition to or protection against acquisition of HIV-1 infection, we conducted fine-mapping of variants within the extended major histocompatibility complex (MHC) in a cohort of Zambian HIV-1 serodiscordant heterosexual couples. We assessed 6,865 MHC SNPs in a population of 439 initially seronegative Zambian individuals, of whom 212 seroconverted during follow-up. Cox proportional hazards regression was performed, adjusting for genetic and non-genetic factors previously associated with HIV-1 transmission in our Zambian cohort. No SNP was significantly associated with HIV-1 acquisition after adjustment for covariates and for multiple comparisons. This is the first study to examine associations of HIV-1 acquisition with extended MHC markers interrogated through the dense SNP coverage provided by the ImmunoChip. Although we failed to identify any significant association, regions whose signals fell short of the significance threshold may warrant further investigation in a replication cohort.

INTRODUCTION

There is great inter-individual variability in risk for HIV-1 acquisition depending on the presence of a number of exposure risk factors proven to modify susceptibility,

including age, gender, history of sexually transmitted disease, route of transmission, and level of HIV-1 RNA in the plasma 1-6. Yet even among individuals with similar levels of exposure, susceptibility still varies, suggesting that other host factors, including human genetic variation, may determine the risk of infection 7.

Despite ongoing research, genetic determinants of susceptibility to HIV-1 infection remain elusive. Only variants within the CCR5 gene have been verified in multiple candidate gene studies to significantly alter risk for HIV-1 acquisition 8-16, with a 32 base-pair deletion in the coding region being the only mutation known to confer virtually complete resistance to infection. To date, genome-wide association studies (GWAS) in two separate African cohorts have been unable to detect any variants significantly associated with acquisition 17,18. One genome-wide meta-analysis identified the T allele for rs , a SNP mapping between BHLHE22 and CYP7B1, as associated with protection across multiple Caucasian cohorts 19. It is important to continue searching for other human genetic factors that may influence acquisition, especially in African populations, where CCR5Δ32 is absent and the risk for HIV-1 infection is greatest 20. Identification of genetic factors that modify acquisition may inform new preventive and therapeutic strategies.

One barrier to identifying novel host genetic associations with HIV-1 acquisition lies in defining a suitable study cohort. To adequately study genetic determinants of HIV-1 acquisition, nongenetic factors that may alter risk between groups should be taken into account 5,18. Historically, individuals have been recruited only after infection has occurred, regardless of the stage of infection, making it difficult to retrospectively identify factors modulating acquisition 21. There is also difficulty in recruiting and retaining a large enough group of individuals who are amply exposed but

vary in the likelihood or ease of developing infection to supply sufficient statistical power 17. Longitudinal studies of heterosexual serodiscordant couples provide the opportunity to assess genetic correlates of HIV-1 transmission and acquisition in a well-defined cohort. Recent seroconverters (SCs) can be compared to individuals with known risk exposure who remain seronegative (ESNs); furthermore, viral linkage between partners can be confirmed, and both donor and recipient risk factors can be taken into account 5,18.

A further limitation of relying on GWAS to disclose new candidate markers in diverse populations has been that the early genotyping platforms were designed for use in populations of European ancestry. These genotyping arrays included only a limited number of tagging variants in strong enough linkage disequilibrium (LD) with potential causal variants. Although the Illumina HumanHap 1M-Duo BeadChip 22 used in the two GWAS of HIV-1 acquisition in African cohorts offered more coverage than other arrays, it was known to provide less coverage for non-European populations. Given the greater genetic diversity and shorter spans of LD in African populations, causal variants in these populations were less likely to be tagged by variants on this array 18,23.

The Zambian cohort of serodiscordant couples is one of the largest cohorts available for longitudinal study of factors related to HIV-1 transmission and acquisition in Africans. Nongenetic risk factors related to both partners have been identified and verified over time, including HIV-1 RNA viral load (VL) in the index partner, lack of circumcision in male seronegative partners, and presence of genital ulcers or inflammation in either partner 24,25. In addition, candidate gene analyses of transmission by the index partner and acquisition by the initially seronegative partner have recognized potential genetic correlates for HIV-1 infection 16. Using the ImmunoChip

genotyping array, designed for fine-mapping of immune-related genes, we report here the association between variants within the extended major histocompatibility complex (MHC) region of the genome and risk for HIV-1 acquisition, while controlling for genetic and nongenetic factors previously reported within this Zambian cohort.

MATERIALS AND METHODS

Subjects. We studied a sample of 480 initially seronegative partners from heterosexual couples enrolled in the Zambia-Emory HIV Research Project (ZEHRP) cohort. The design and structure of this cohort have been described in detail previously 24,29,30. This study was approved by the institutional review boards in Lusaka, at Emory University, and at the University of Alabama at Birmingham; all subjects gave written informed consent prior to participation.

Subject selection. From the 480 individuals selected for our study, samples were excluded based on three quality control (QC) criteria: a call rate <0.95, status as a labeled duplicate included for internal QC checks, or failed sex determination. One sample corresponding to a labeled duplicate and nine samples with call rates below 95% were excluded. No samples were excluded based on ambiguities in sex determination. For pairs of individuals estimated to be third-degree relatives or closer, we used the unrelated selection procedure described by Manichaikul et al. and implemented in the KING software package (10 individuals excluded) 31,32.

Population stratification. Population stratification was assessed using multidimensional scaling (MDS) implemented in KING 32,33. We included data on 1,184 unrelated individuals from the eleven populations in Phase 3 of the International HapMap

Project. The HapMap 3 samples were genotyped on both the Illumina 1M and Affymetrix 6.0 genome-wide arrays. For the MDS analysis, we used a subset of approximately 30,000 single nucleotide polymorphisms (SNPs) that 1) were annotated with rsIDs on the ImmunoChip, 2) were outside of regions of known extended LD in European populations, and 3) could be reliably aligned with the HapMap 3 data (i.e., after removing ambiguous A/T and C/G SNPs) 34. Based on this analysis, we excluded three outlying samples from further analysis.

Acquisition assessment. Serodiscordant couples in the ZEHRP cohort were followed through quarterly visits until transmission occurred, the couple was lost to follow-up or initiated antiretroviral therapy (ART), or the study ended. For couples in which transmission did occur, the estimated date of infection (EDI) was calculated as the midpoint between the last seronegative visit and the first seropositive visit. Upon serologic identification of HIV-1 transmission, confirmatory blood testing was completed for initially seronegative partners with positive HIV-1 serology. For these individuals with observed acquisition, epidemiological linkage of the virus was examined by sequence analysis and phylogenetic tree analysis for both partners 29. Individuals classified as epidemiologically unlinked were excluded from further analysis (18 samples). Individuals in this cohort were all ART-naïve, were predominantly infected with HIV-1 subtype C, and had at least nine months of follow-up.

Viral load ascertainment. Plasma VL was measured as HIV-1 RNA (copies/ml) using the Roche Amplicor 1.0 assay (Roche Diagnostics Systems Inc., Branchburg, NJ) with a lower limit of detection of 400 copies/ml. The first log10-transformed VL

measurement available in the index partner was used as the index VL. Measurements of VL below the lower limit of detection were assigned a value of (half log ).

Genotyping. HLA class I and class II genotyping was completed to the first four digits using a combination of PCR-based methods, including PCR with sequence-specific primers (SSP) (Dynal/Invitrogen, Brown Deer, WI), automated sequence-specific oligonucleotide (SSO) probe hybridization (Innogenetics, Alpharetta, GA), and sequencing-based typing (SBT) (Abbott Molecular, Inc., Des Plaines, IL) using capillary electrophoresis and the ABI 3130xl DNA Analyzer (Applied Biosystems, Foster City, CA). KIR genotyping was completed by PCR with sequence-specific primers (PCR-SSP; Invitrogen). Genotypes on the Illumina ImmunoChip 35 were inferred using the joint calling and haplotype phasing algorithm implemented in BEAGLECALL 36. We completed a series of data cleaning and quality control procedures, excluding SNPs based on the following criteria: call rate <0.985, minor allele frequency (MAF) <0.025, or deviation from Hardy-Weinberg equilibrium (HWE; p < 10⁻⁶). Of the SNPs included on the ImmunoChip and located within the extended MHC 37, 6,865 passed quality control.

Statistical analysis. Patient characteristics and covariates were compared between transmitting and non-transmitting partners by t test (continuous variables) or χ² test (categorical variables). Single variants were tested in multivariable Cox proportional hazards models assuming additive effects and adjusting for known associated alleles in addition to gender, age at enrollment, methionine carriage at P2 in the signal peptide of an HLA-B allele (P2-Met) 28, and log10 VL in the index partner. Additional variants adjusted for included HLA-A*68:02 in the initially seronegative partner and HLA-A*36

and KIR2DS4 in the index partner 25,27. GraphPad Prism (GraphPad Software, Inc.) was used to generate Kaplan-Meier curves to compare time-to-transmission for selected variant genotypes. Due to the large number of tests in multivariable models, both q-values with a false discovery rate (FDR) of 0.2 and the Bonferroni correction were used to account for multiple testing. To account for the widespread LD within the MHC, SimpleM was used to calculate the effective number of independent tests for the Bonferroni-corrected alpha level 38,39. Based on the number of non-duplicated SNPs included on the ImmunoChip, the SimpleM estimate for the number of independent tests was 1,787; therefore a p value below 2.8 × 10⁻⁵ (0.05/1,787) was considered statistically significant for testing in multivariable models.

RESULTS

Characteristics of initially seronegative individuals included in this study. Of the eligible sample of initially seronegative individuals, 439 were successfully genotyped and met all inclusion criteria; 212 individuals seroconverted during follow-up. The overall characteristics of SCs and ESNs are presented in Table 1. Consistent with previous observations in this cohort, SCs tended to be younger and female, to possess at least one HLA-A*68:02 allele, and to have P2-Met. Index partners of SCs also had a greater proportion of HLA-A*36 and KIR2DS4 and a higher mean log10 VL.

Previously reported statistically significant associations with HLA-A*68:02 and P2-Met carriage in the initially seronegative partner, as well as HLA-A*36, KIR2DS4, and log10 VL in the index partner, were all confirmed through regression analysis (Supplemental Table 1). Previously reported associations with HLA-B*42-C*17, the HHD-HHE haplotype, HLA class I score, and HLA-C*18 in the index partner were not replicated in this cohort (data not shown). Due to missing data, nine SCs and 38 ESNs were excluded from further analysis. There were no significant differences between SCs included for analysis and those excluded (Supplemental Table 2); more male than female ESNs were excluded due to missing data (35 males and 3 females, p <0.001), largely due to lack of P2-Met information.

Extended MHC variants and HIV-1 acquisition. No SNP within the extended MHC reached significance at a threshold of p < 2.8 × 10⁻⁵ or an FDR < 0.2 in multivariable Cox proportional hazards models (Figure 1). Table 2 lists all SNPs with p < 0.01 from additive genetic models. rs592625, in the 3′ UTR of HLA-DOA, approached statistical significance after multivariable adjustment (HR = 1.6, p = 4.0 × 10⁻⁴). The rs592625 G allele appears to be associated with accelerated HIV-1 acquisition. In Kaplan-Meier analysis, individuals with the rs592625 G/G genotype had a significantly shorter median time-to-transmission than individuals with the A/G and A/A genotypes (log-rank p = 0.002) (Figure 2). There were no significant differences between those with the A/G and A/A genotypes (log-rank p = 0.087). In a supplemental investigation using dominant genetic models, we found results similar to those from additive genetic models (data not shown).

DISCUSSION

We assessed almost 7,000 SNPs within the extended MHC and failed to detect any statistically significant associations with HIV-1 acquisition. Although our analysis focused specifically on the extended MHC, these findings are consistent with two prior GWAS of

HIV-1 acquisition in African cohorts 17,18. This is the first study to directly evaluate time to HIV-1 acquisition in a longitudinally studied cohort of exposed SCs with varying EDIs. This is also the first analysis of genetic determinants of acquisition to use Cox proportional hazards models, which can be more powerful for detecting variants associated with a time-dependent outcome such as HIV-1 acquisition than the analytic approaches typically applied to case-control studies.

Although it did not reach statistical significance, the signal for rs592625 in the 3′ UTR of the HLA-DOA gene, which encodes an HLA class II molecule, is of interest. Research on HLA class II molecules and HIV-1 outcomes has shown little consistency, and what has been done has focused on HLA-DR and -DQ 40,41. HLA-DOA encodes the α chain of the HLA-DO heterodimer, which, together with HLA-DOβ, is expressed in B-cell lysosomes and is involved in regulating HLA-DM-mediated loading of short antigenic peptides onto MHC class II molecules. Given its role in either enhancing or suppressing antigenic peptide presentation, future investigation of alterations in the expression of HLA-DO molecules could provide a new avenue to explain some of the unexplained host genetic variation in HIV-1 acquisition.

There is one important limitation in our present analysis. Complete data on risk factors (frequency of sex with and without a condom, genital ulcers or inflammation, circumcision) were not available for all subjects, so exposure status over time could not be determined accurately 5,45,46. Risk exposures are important determinants of transmission 5,47. ESNs who have actually had little HIV-1 exposure could be misclassified as resistant to infection, reducing power to detect genetic associations 18,48. However, we were able to account for VL in the index partner, which has been shown to have the greatest impact on

risk for transmission 4,24. In addition, this is the first investigation to include HLA variants previously shown to modulate transmission/acquisition in this cohort. The availability of those covariates enabled us to adjust at least partially for modifiers of HIV-1 acquisition that have been epidemiologically established in this population. Future refinements that take advantage of complete, detailed risk exposure information may help to tease out new signals of host genetic influence on acquisition.

In summary, this study aimed to capitalize on the fine-mapping afforded by the ImmunoChip, but it was unable to detect any variants significantly associated with differences in the time to HIV-1 acquisition between SCs and ESNs. Further study in a larger population with complete risk factor information could potentially validate the relationship observed here with rs592625 and increase the power to identify other statistically significant signals.

ACKNOWLEDGMENTS

This work was supported by multiple grants, including R01 AI (to E.H.), R37 AI (to E.H.), R01 AI (to R.A.K./J.T.) from the NIAID, UL1 RR from the Clinical Translational Science Award program, NCRR, Fogarty International Center D43 TW001042, and funding from the International AIDS Vaccine Initiative (IAVI) to S.A. This work was further supported by the IAVI Protocol C research network.

REFERENCES

1. Royce RA, Seña A, Cates W Jr, Cohen MS. Sexual transmission of HIV. N Engl J Med. 1997;336(15):

2. Operskalski EA, Stram DO, Busch MP, et al. Role of viral load in heterosexual transmission of human immunodeficiency virus type 1 by blood transfusion recipients. Transfusion Safety Study Group. Am J Epidemiol. 1997;146(8):
3. Long EM, Martin HL Jr, Kreiss JK, et al. Gender differences in HIV-1 diversity at time of infection. Nat Med. 2000;6(1):
4. Quinn TC, Wawer MJ, Sewankambo N, et al. Viral load and heterosexual transmission of human immunodeficiency virus type 1. Rakai Project Study Group. N Engl J Med. 2000;342(13):
5. Kaslow RA, Dorak T, Tang JJ. Influence of host genetic variation on susceptibility to HIV type 1 infection. J Infect Dis. 2005;191(Suppl 1):S
6. Borrow P, Shattock RJ, Vyakarnam A, EUROPRISE Working Group. Innate immunity against HIV: a priority target for HIV prevention research. Retrovirology. 2010;7:
7. Plummer FA, Ball TB, Kimani J, Fowke KR. Resistance to HIV-1 infection among highly exposed sex workers in Nairobi: what mediates protection and why does it develop? Immunol Lett. 1999;66(1-3):
8. Dean M, Carrington M, Winkler C, et al. Genetic restriction of HIV-1 infection and progression to AIDS by a deletion allele of the CKR5 structural gene. Hemophilia Growth and Development Study, Multicenter AIDS Cohort Study, Multicenter Hemophilia Cohort Study, San Francisco City Cohort, ALIVE Study. Science. 1996;273(5283):
9. Huang Y, Paxton WA, Wolinsky SM, et al. The role of a mutant CCR5 allele in HIV-1 transmission and disease progression. Nat Med. 1996;2(11):
10. Liu R, Paxton WA, Choe S, et al. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell. 1996;86(3):
11. Michael NL, Chang G, Louie LG, et al. The role of viral phenotype and CCR-5 gene defects in HIV-1 transmission and disease progression. Nat Med. 1997;3(3):

12. Zimmerman PA, Buckler-White A, Alkhatib G, et al. Inherited resistance to HIV-1 conferred by an inactivating mutation in CC chemokine receptor 5: studies in populations with contrasting clinical phenotypes, defined racial background, and quantified risk. Mol Med. 1997;3(1):
13. Mann DL, Garner RP, Dayoff DE, et al. Major histocompatibility complex genotype is associated with disease progression and virus load levels in a cohort of human immunodeficiency virus type 1-infected Caucasians and African Americans. J Infect Dis. 1998;178(6):
14. Martinson JJ, Hong L, Karanicolas R, Moore JP, Kostrikis LG. Global distribution of the CCR2-64I/CCR T HIV-1 disease-protective haplotype. AIDS. 2000;14(5):
15. Tang J, Shelton B, Makhatadze NJ, et al. Distribution of chemokine receptor CCR2 and CCR5 genotypes and their relative contribution to human immunodeficiency virus type 1 (HIV-1) seroconversion, early HIV-1 RNA concentration in plasma, and later disease progression. J Virol. 2002;76(2):
16. Malhotra R, Hu L, Song W, et al. Association of chemokine receptor gene (CCR2-CCR5) haplotypes with acquisition and control of HIV-1 infection in Zambians. Retrovirology. 2011;8:
17. Petrovski S, Fellay J, Shianna KV, et al. Common human genetic variants and HIV-1 susceptibility: a genome-wide survey in a homogeneous African population. AIDS. 2010;25(4):
18. Lingappa JR, Petrovski S, Kahle E, et al. Genomewide association study for determinants of HIV-1 acquisition and viral set point in HIV-1 serodiscordant couples with quantified virus exposure. PLoS One. 2011;6(12):e
19. Limou S, Delaneau O, van Manen D, et al. Multicohort genomewide association study reveals a new signal of protection against HIV-1 acquisition. J Infect Dis. 2012;205(7):
20. National Institute of Allergy and Infectious Diseases. What are HIV and AIDS? HIV/AIDS 2008; DS.aspx. Accessed November 27, 2011.

21. Guergnon J, Theodorou I. What did we learn on host's genetics by studying large cohorts of HIV-1-infected patients in the genome-wide association era? Curr Opin HIV AIDS. 2011;6(4):
22. Illumina Inc. Human 1M-Duo DNA Analysis BeadChip Kits.
23. Chapman SJ, Hill AV. Human genetic susceptibility to infectious disease. Nat Rev Genet. 2012;13(3):
24. Fideli US, Allen SA, Musonda R, et al. Virologic and immunologic determinants of heterosexual transmission of human immunodeficiency virus type 1 in Africa. AIDS Res Hum Retroviruses. 2001;17(10):
25. Song W, He D, Brill I, et al. Disparate associations of HLA class I markers with HIV-1 acquisition and control of viremia in an African population. PLoS One. 2011;6(8):e
26. Tang J, Malhotra R, Song W, et al. Human leukocyte antigens and HIV type 1 viral load in early and chronic infection: predominance of evolving relationships. PLoS One. 2010;5(3):e
27. Merino A, Malhotra R, Morton M, et al. Impact of a functional KIR2DS4 allele on heterosexual HIV-1 transmission among discordant Zambian couples. J Infect Dis. 2011;203(4):
28. Merino AM, Song W, He D, et al. HLA-B signal peptide polymorphism influences the rate of HIV-1 acquisition but not viral load. J Infect Dis. 2012;205(12):
29. Trask SA, Derdeyn CA, Fideli U, et al. Molecular epidemiology of human immunodeficiency virus type 1 transmission in a heterosexual cohort of discordant couples in Zambia. J Virol. 2002;76(1):
30. Kempf MC, Allen S, Zulu I, et al. Enrollment and retention of HIV discordant couples in Lusaka, Zambia. J Acquir Immune Defic Syndr. 2008;47(1):
31. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):

32. Manichaikul A, Palmas W, Rodriguez CJ, et al. Population structure of Hispanics in the United States: the multi-ethnic study of atherosclerosis. PLoS Genet. 2012;8(4):e
33. Zhu X, Li S, Cooper RS, Elston RC. A unified association analysis approach for family and unrelated samples correcting for stratification. Am J Hum Genet. 2008;82(2):
34. International HapMap 3 Consortium, Altshuler DM, Gibbs RA, et al. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467(7311):
35. Cortes A, Brown MA. Promise and pitfalls of the Immunochip. Arthritis Res Ther. 2011;13(1):
36. Browning BL, Yu Z. Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genomewide association studies. Am J Hum Genet. 2009;85(6):
37. de Bakker PI, McVean G, Sabeti PC, et al. A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet. 2006;38(10):
38. Gao X, Starmer J, Martin ER. A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet Epidemiol. 2008;32(4):
39. Gao X, Becker LC, Becker DM, Starmer JD, Province MA. Avoiding the high Bonferroni penalty in genome-wide association studies. Genet Epidemiol. 2010;34(1):
40. Singh P, Kaur G, Sharma G, Mehra NK. Immunogenetic basis of HIV-1 infection, transmission and disease progression. Vaccine. 2008;26(24):
41. Kaur G, Mehra N. Genetic determinants of HIV-1 infection and progression to AIDS: immune response genes. Tissue Antigens. 2009;74(5):

42. Santin I, Castellanos-Rubio A, Aransay AM, et al. Exploring the diabetogenicity of the HLA-B18-DR3 CEH: independent association with T1D genetic risk close to HLA-DOA. Genes Immun. 2009;10(6):
43. Souwer Y, Chamuleau ME, van de Loosdrecht AA, et al. Detection of aberrant transcription of major histocompatibility complex class II antigen presentation genes in chronic lymphocytic leukaemia identifies HLA-DOA mRNA as a prognostic factor for survival. Br J Haematol. 2009;145(3):
44. Xiu F, Côté MH, Bourgeois-Daigneault MC, et al. Cutting edge: HLA-DO impairs the incorporation of HLA-DM into exosomes. J Immunol. 2011;187(4):
45. Horton RE, McLaren PJ, Fowke K, Kimani J, Ball TB. Cohorts for the study of HIV-1-exposed but uninfected individuals: benefits and limitations. J Infect Dis. 2010;202(Suppl 3):S
46. An P, Winkler CA. Host genes associated with HIV/AIDS: advances in gene discovery. Trends Genet. 2010;26(3):
47. Boily MC, Baggaley RF, Wang L, et al. Heterosexual risk of HIV-1 infection per sexual act: systematic review and meta-analysis of observational studies. Lancet Infect Dis. 2009;9(2):
48. Shea PR, Shianna KV, Carrington M, Goldstein DB. Host genetics of HIV acquisition and viral control. Annu Rev Med. 2013;64(13):1-15.

FIGURE 1. Minus log p value plot for analysis of HIV-1 acquisition within the major histocompatibility complex. −log10(p) is plotted for all SNPs against their physical location on chromosome 6. No SNP reached the threshold for statistical significance (p < 2.8 × 10⁻⁵, −log10(p) = 4.5 [red dotted line]).
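The position of the significance line in Figure 1 follows from simple arithmetic. The sketch below assumes a family-wise alpha of 0.05 (an assumption, although it is consistent with the reported threshold) divided by the SimpleM estimate of 1,787 independent tests; the variable names are illustrative only.

```python
import math

# Assumed family-wise error rate; 0.05 reproduces the reported threshold.
alpha = 0.05
# Effective number of independent tests estimated by SimpleM (from the text).
effective_tests = 1787

# Bonferroni-style threshold using the effective number of tests.
threshold = alpha / effective_tests
# Height of the significance line on a -log10(p) plot.
neg_log10_threshold = -math.log10(threshold)

print(f"p value threshold: {threshold:.2e}")          # 2.80e-05
print(f"-log10(threshold): {neg_log10_threshold:.2f}")  # 4.55
```

For comparison, the conventional genome-wide threshold of 5 × 10⁻⁸ corresponds to a line at about 7.3 on the same scale, which shows how much the LD-aware correction relaxes the criterion within the MHC.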

[Plot: percent HIV-1 seronegative (y-axis) against time-to-transmission in weeks (x-axis), with separate curves for the AA, AG, and GG genotypes and a log-rank p value.]

FIGURE 2. Kaplan-Meier curve of time-to-transmission (in weeks) according to rs592625 genotypes (a variant in the 3′ UTR of the HLA-DOA gene).
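For readers unfamiliar with how curves such as those in Figure 2 are built, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. It is not the procedure used in this study (the curves were generated in GraphPad Prism), and the follow-up times and event indicators are hypothetical toy values, not cohort data.

```python
def kaplan_meier(times, events):
    """Return [(time, fraction still seronegative)] at each distinct event time.

    times  -- follow-up time for each person (e.g. weeks)
    events -- 1 if seroconversion was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # seroconversions at time t
        leaving = 0    # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy data: five people, two of them censored.
toy_times = [10, 20, 20, 30, 40]
toy_events = [1, 1, 0, 1, 0]
for week, fraction in kaplan_meier(toy_times, toy_events):
    print(week, round(fraction, 3))
```

Censored individuals leave the risk set without causing a step down, which is why the curve only drops at weeks where a seroconversion was observed.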

TABLE 1. Overall characteristics of 212 seroconverters (SCs) and 227 exposed seronegatives (ESNs) with SNP genotyping results.
Columns: SCs (N = 212); ESNs (N = 227); p
Sex ratio (M/F) 0.7 (84/128) 1.2 (122/105)
Age at enrollment: mean ± SD (yr) 28.1 ± ± 8.5 <0.001
Dates of enrollment
Earliest Mar 1995 Mar 1995
Latest Feb 2008 Jan 2006
Duration of follow-up: median (IQR) (weeks) 71 (29-152) 222 ( ) <0.001
Covariates: N (%)
HLA-A*68:02 40 (18.9) 28 (12.3)
P2-Met carrier 104 (50.5) (n=206) 78 (40.8) (n=191) <0.001
Index partner HLA-A*36 35 (16.5) 15 (6.6) (n=226)
Index partner KIR2DS4 187 (89.5) (n=209) 171 (75.3) (n=225)
Index partner log10 viral load: mean ± SD 5.0 ± 0.7 (n=208) 4.5 ± 1.0 (n=226) <0.001
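As an illustration of the χ² comparison used for categorical variables, the sketch below applies Pearson's test to the sex distribution reported in Table 1 (84 males and 128 females among SCs; 122 males and 105 females among ESNs). It assumes no continuity correction, which is not specified in the text, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: SCs, ESNs; columns: male, female (counts taken from Table 1).
chi2, p_value = chi_square_2x2(84, 128, 122, 105)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

The statistical package and any correction actually used for Table 1 may differ, so this calculation should be read as a plausibility check rather than a reproduction of the reported values.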

TABLE 2. Summary of additive Cox proportional hazard models for HIV-1 acquisition. *
Columns: SNP; position; gene; class; allele; MAF (SCs); MAF (ESNs); Analysis 1 (HR, 95% CI, p); Analysis 2 (HR, 95% CI, p)
rs ,080,668 HLA-DOA UTR G 1.6 (1.2, 2.1) 3.00E (1.2, 2.1) 4.00E-04
rs ,045,812 HLA-A HCG9 intergenic A 1.2 (1.0, 1.5) (1.1, 1.7)
rs ,805,719 FLOT1 intronic G 1.4 (1.1, 1.7) (1.1, 1.7)
rs ,798,917 TUBB intronic A 1.4 (1.1, 1.7) (1.1, 1.7)
rs ,920,476 IER3 DDR1 intergenic G 1.6 (1.2, 2.0) 2.00E (1.1, 1.8)
rs ,245,845 TRIM15 intronic G 0.9 (0.7, 1.1) (0.6, 0.9)
rs ,426,588 C6orf10 intronic G 1.4 (1.1, 1.7) (1.1, 1.8)
rs ,428,131 C6orf10 intronic C 1.4 (1.1, 1.7) (1.1, 1.8)
rs ,444,165 C6orf10 intronic A 1.4 (1.1, 1.7) (1.1, 1.8)
rs ,444,473 C6orf10 intronic T 1.4 (1.1, 1.7) (1.1, 1.8)
rs ,923,930 IER3 DDR1 intergenic A 1.6 (1.3, 2.0) 2.00E (1.1, 1.8)
rs ,925,858 IER3 DDR1 intergenic A 1.6 (1.3, 2.0) 2.00E (1.1, 1.8)
rs ,940,042 IER3 DDR1 intergenic A 1.6 (1.3, 2.0) 2.00E (1.1, 1.8)
rs ,952,239 IER3 DDR1 intergenic T 1.6 (1.3, 2.0) 2.00E (1.1, 1.8)
rs ,881,218 IER3 DDR1 intergenic C 0.6 (0.4, 0.9) (0.3, 0.8)
rs ,881,293 IER3 DDR1 intergenic G 0.6 (0.4, 0.9) (0.3, 0.8)
rs ,884,600 IER3 DDR1 intergenic A 0.6 (0.4, 0.9) (0.3, 0.8)
rs ,888,915 LINC00243 ncRNA A 0.6 (0.4, 0.9) (0.3, 0.8)
rs ,890,302 LINC00243 ncRNA G 0.6 (0.4, 0.9) (0.3, 0.8)
rs ,900,214 LINC00243 intronic C 0.6 (0.4, 0.9) (0.3, 0.8)
rs ,900,343 LINC00243 intronic G 0.6 (0.4, 0.9) (0.3, 0.8)
rs ,233,836 HLA-DPB2 COL11A2 intergenic C 1.4 (1.1, 1.7) (1.1, 1.7)
rs ,353,348 NOTCH4 C6orf10 intergenic T 1.4 (1.1, 1.7) (1.1, 1.8)
rs ,367,505 NOTCH4 C6orf10 downstream C 1.4 (1.1, 1.7) (1.1, 1.8)
rs ,591,947 UBD intergenic C 1.2 (1.0, 1.5) (1.1, 1.7)
rs ,592,645 UBD intergenic C 1.2 (1.0, 1.5) (1.1, 1.7)
rs ,594,194 UBD intergenic G 1.2 (1.0, 1.5) (1.1, 1.7)
rs ,243,940 COL11A2 intronic C 1.2 (1.0, 1.5) (1.1, 1.6)
rs ,585,219 UBD intergenic G 1.2 (1.0, 1.5) (1.1, 1.7)
rs ,040,359 HLA-A HCG9 intergenic T 1.2 (1.0, 1.4) (1.1, 1.6)
* Results with p < 0.01 in Analysis 2 are presented; donor = index partner.
Analysis 1: adjusted for age at enrollment and gender.
Analysis 2: adjusted for age at enrollment, gender, HLA-A*68:02, P2-Met carrier, donor HLA-A*36, donor KIR2DS4, donor log10 VL, and duration of follow-up.

TABLE S1. Univariable results of covariates included in multivariable models for analysis of HIV-1 acquisition. *
Columns: covariate; HR; 95% CI; p
Age at enrollment (per year) 0.9 ( ) <0.001
Female gender 1.6 ( )
HLA-A*68: ( )
P2-Met carrier 1.4 ( )
Index partner HLA-A* ( ) <0.001
Index partner KIR2DS4 2.4 ( ) <0.001
Index partner log10 viral load 1.8 ( ) <0.001
* Univariable models

TABLE S2. Comparison of overall characteristics for seroconverters (SCs) and exposed seronegatives (ESNs) included for multivariable analysis versus individuals excluded.
Columns: analyzed SCs (N = 203); excluded SCs (N = 9); p; analyzed ESNs (N = 189); excluded ESNs (N = 38); p
Sex ratio (M/F) 0.7 (81/122) 0.5 (3/6) (87/102) 11.7 (35/3) <0.001
Age at enrollment: mean ± SD (yr) 28.1 ± ± ± ±
Duration of follow-up: median (IQR) (weeks) 72 (29-155) 62 (7-93) ( ) 191 ( )
Covariates: N (%)
HLA-A*68:02 39 (19.2) 1 (11.1) (13.2) 3 (7.9)
P2-Met carrier 103 (50.7) 1 (33.3) (n=3) (41.3) 0 (0.0) (n=2)
Index partner HLA-A*36 35 (17.2) 0 (0.0) (6.4) 3 (8.1) (n=37)
Index partner KIR2DS4 181 (89.2) 6 (100.0) (n=6) (77.3) 25 (69.4) (n=36)
Index partner log10 viral load: mean ± SD 5.0 ± ± 0.7 (n=5) ± ± 0.9 (n=37) 0.177

APPLICATION OF FINE-MAPPING WITHIN THE EXTENDED HUMAN MAJOR HISTOCOMPATIBILITY COMPLEX IN IDENTIFICATION OF NOVEL VARIANTS WITH VIRAL CONTROL DURING TWO DISTINCT STAGES OF HIV-1 INFECTION

by

HEATHER A. PRENTICE, NICHOLAS M. PAJEWSKI, KUI ZHANG, ELIZABETH E. BROWN, RICHARD A. KASLOW, AND JIANMING TANG

In preparation for PLoS Genetics

Format adapted for dissertation

ABSTRACT

A number of genome-wide association studies have verified the strong relationship between HLA-B alleles and immune control of HIV-1 viral load (VL). Fine-mapping may be an advantageous alternative to standard genome-wide approaches. We tested the utility of a genotyping platform designed for fine-mapping of immune-related genes in identifying novel genetic variants associated with VL control during early chronic and chronic infection. We investigated 6,865 single nucleotide polymorphisms (SNPs) within the human major histocompatibility complex in a population of 172 seroconverters (SCs) with an estimated date of infection and 449 seroprevalent individuals (SPs) from Zambia. We used a combination of linear regression models and penalized approaches (HyperLasso), adjusting for genetic factors known to be associated with VL in our cohort, first for set-point VL in SCs and then for chronic VL in SPs. Even after accounting for HLA variants known to modulate VL, the HyperLasso models suggested that rs (intergenic between RPP21 and HLA-E) and rs (intronic SNP in NOTCH4) had favorable and unfavorable associations with set-point VL in SCs, respectively. In HyperLasso analysis of chronic VL, rs , downstream from HLA-DOB, was associated with an unfavorable VL. Although our fine-mapping experiment failed to directly identify variants with a clear functional role in HIV-1 pathogenesis, the novel genetic signals identified in our study may be of interest for further investigation.

INTRODUCTION

In the era of genome-wide association studies (GWAS), the search for definitive host genetic correlates of effective HIV-1 VL control has yet to yield convincing results

beyond those already identified through hypothesis-driven approaches. The most consistent observations center on the importance of human leukocyte antigen (HLA) class I genes, although other candidates within the human major histocompatibility complex (MHC) region on chromosome 6p21 may play a role 1,2. Across eight reported GWAS 3-10, only two single nucleotide polymorphisms (SNPs) within the MHC have consistently shown a favorable impact on set-point viral load (VL): rs within the HCP5 gene and rs located 35 kb upstream of HLA-C. Follow-up studies have noted that rs at least is in linkage disequilibrium (LD) with HLA-B*57, an established allele for favorable HIV-1-related outcomes, and thus is likely to simply be tagging this allele 9,11,12. Results using other outcome measures have largely been inconsistent, although rs near the ZNRD1 gene appears to be associated with favorable disease outcomes. The majority of GWAS to date have been performed in populations of European descent; two that included admixed African-American individuals did not detect any robust novel associations with disease progression 9,10.

While population heterogeneity, rare variants, viral diversity, and other population-specific factors may account for the lack of consensus findings in the literature 13,14, fine-mapping through dense coverage of informative regions may offer an alternative approach to further dissect genotype-phenotype relationships in the context of HIV-1 infection. Potential associations missed by conventional GWAS scans may simply reflect a lack of coverage, as demonstrated by McLaren et al. in a fine-mapping study of African-American HIV-1 controllers 15. In sub-Saharan African populations, which have the greatest burden of HIV/AIDS and profound genetic diversity, fine-mapping of genetic factors related to immune control of

HIV-1 infection may help to steer future translational research. Furthermore, lower levels of LD (due to increased recombination) make sub-Saharan populations ideal candidates for fine-mapping exercises, especially with the density of genotyping afforded by the current generation of genotyping arrays.

Relying on a large Zambian cohort with longitudinal data related to HIV-1 control, earlier analyses have identified several favorable and unfavorable host genetic factors that have a cumulative impact on VL control; these studies have also highlighted the evolving relationships dictated by viral evolution and adaptation to the host environment. Here we demonstrate that fine-mapping of the human MHC region can reveal novel associations within this Zambian cohort, both for early VL in recent HIV-1 seroconverters with a known estimated date of infection (EDI) and for chronic VL in chronically infected (seroprevalent) individuals with an unknown duration of HIV-1 infection.

MATERIALS AND METHODS

Subjects. We studied 724 HIV-1 positive individuals enrolled in the Zambia-Emory HIV Research Project (ZEHRP) cohort, including 242 seroconverting individuals (SCs) with an established EDI and 482 seroprevalent individuals (SPs). The design and structure of this cohort have been described in detail previously. This study was approved by the institutional review boards in Lusaka and at Emory University and the University of Alabama at Birmingham; all subjects gave written informed consent for participation.

Subject selection. Samples were excluded based on three quality control criteria: call rate <0.95, labeled duplicate status, or failed sex determination. Seven samples corresponding to technical duplicates and 11 samples with call rates below 95 percent were excluded. No samples were excluded based on ambiguities in sex determination. For pairs of individuals estimated to be third-degree relatives or closer, we used the unrelated selection procedure described in Manichaikul et al. and implemented in the KING software package (nine individuals excluded) 28,29.

Population stratification. Population stratification was assessed using multidimensional scaling (MDS) implemented in KING 29,30. We included data on 1,184 unrelated individuals from the eleven populations in Phase 3 of the International HapMap Project. The HapMap 3 samples were genotyped on both the Illumina 1M and Affymetrix 6.0 genome-wide arrays. For the MDS analysis, we used a subset of SNPs (~30,000) that 1) were annotated with rsIDs on the ImmunoChip, 2) were outside of regions of known extended LD in European populations, and 3) could be reliably aligned with the HapMap 3 data (i.e., after removing ambiguous A/T and C/G SNPs) 31. Four outlying samples were excluded from further analysis.

Virologic assessment. Plasma VL was measured as HIV-1 RNA (copies/ml) using the Roche Amplicor 1.0 assay (Roche Diagnostics Systems Inc., Branchburg, NJ), with a lower limit of detection of 400 copies/ml. In all SCs, set-point VL was calculated as the log10-transformed geometric mean of all available VL measurements collected between three and 12 months after the EDI (median of two VL measurements); 49 SCs were excluded for lack of VL information during the required time period. In all SPs, the first log10-transformed VL measurement available was used for

analysis; five SPs were excluded for lack of VL information. Subjects with VL below the lower limit of detection were excluded (five SCs and 13 SPs).

Genotyping. HLA class I and class II genotyping was completed to the first four digits using a combination of PCR-based methods, including PCR with sequence-specific primers (SSP) (Dynal/Invitrogen, Brown Deer, WI), automated sequence-specific oligonucleotide (SSO) probe hybridization (Innogenetics, Alpharetta, GA), and sequencing-based typing (SBT) (Abbott Molecular, Inc., Des Plaines, IL) using capillary electrophoresis and the ABI 3130xl DNA Analyzer (Applied Biosystems, Foster City, CA). Genotypes on the Illumina ImmunoChip 32 were inferred using the joint calling and haplotype phasing algorithm implemented in BEAGLECALL 33. We completed a series of data cleaning and quality control procedures, excluding SNPs that met any of the following criteria: missingness (call rate <0.985), MAF <0.025 (<0.015 for SPs), or deviation from HWE (p < 10⁻⁶). Of the SNPs included on the ImmunoChip located within the extended MHC 34, 6,865 passed quality control for VL analysis in SCs, while 7,718 were included for VL analysis in SPs.

Statistical analysis. Two alternative approaches were implemented to assess the effect of individual variants on VL control. In the standard approach, individual variants were tested in multivariable models adjusting for known associated markers in addition to gender, age at the EDI, and the number of measurements used to calculate the set-point VL for SCs, and gender, age at VL, and duration of follow-up (DOF) for SPs. HLA markers included were A*74, B*13, and B*57 for SCs and B*57, B*81, and DRB1*01:02 for SPs. Both q-values (at an FDR of 0.2) and the Bonferroni correction were used to account for multiple testing. To

account for the widespread LD within the MHC, SimpleM was used to calculate the effective number of independent tests in order to define a Bonferroni-corrected p value threshold 35,36. Based on the number of non-duplicate SNPs included on the ImmunoChip, the SimpleM estimate of the number of independent tests was 1,787; therefore, a p value of 2.8 × 10⁻⁵ was considered significant for testing in single variant multivariable models.

For the alternative approach, we used penalized regression (HyperLasso) 37 including all variants within the extended MHC, as well as the HLA variants noted for the individual models above. The HyperLasso model uses a hierarchical Normal-Exponential-Gamma (NEG) prior for the SNP regression coefficients, which is parameterized in terms of a shape and a scale parameter. Following Vignal et al., the shape parameter was set to 1, and 100 null permutations were analyzed for both the SC and SP datasets (for a given set of non-SNP covariates, i.e., non-genetic factors and previously identified HLA alleles) over a grid of scale parameter values (Supplemental Table 1) 38. The null datasets were created by randomly pairing phenotypes/non-genetic covariates with genotypes (thus maintaining the relationship between the non-genetic covariates and VLs). We then selected the value of the scale parameter that produced a mean of approximately one SNP selected into the regression model across the null datasets. SNPs were modeled assuming an additive effect and were included only if the minor allele was observed at least 10 times (in either heterozygous or homozygous genotypes) in the particular dataset.

The normality assumption for general linear analysis of VL was examined using a Kolmogorov-Smirnov test. To account for the skewed VL distributions, results are presented for Box-Cox transformed log10 VL. We also considered ordinal logistic regression models using a categorized VL outcome (<10,000 copies/ml,

10,000-100,000 copies/ml, and >100,000 copies/ml). This analysis was performed to evaluate the robustness of any identified effects with respect to the inherent measurement error of the assay used to quantify VL. Haplotype blocks were assigned using HaploView 39.

RESULTS

Characteristics of seroconverting and seroprevalent individuals included for each analysis. Of the eligible individuals, 172 SCs and 449 SPs were successfully genotyped and met all inclusion criteria. Table 1 presents the overall characteristics of individuals included in the two analysis subgroups. The SC group included proportionally more females than the SP group, with male-to-female sex ratios of 0.6 and 1.1, respectively. On average, SCs were younger than SPs. Of the previously established genetic factors tested in both groups, SPs had a higher percentage of HLA-B*57 carriers (12.9 percent versus 7.6 percent in SCs). Mean VL tended to be lower in SCs than in SPs (4.6 ± 0.7 and 4.8 ± 0.7, respectively). Analyses of HLA-A*74, B*13, and B*57 in SCs and B*57, B*81, and DRB1*01:02 in SPs verified their previously reported associations with VL (Supplemental Table 2). Previously reported associations with HLA-A*36 and A*74 in SPs were not replicated in this cohort.

Extended MHC variants and HIV-1 set-point VL in seroconverters. No SNP reached the p value significance threshold of 2.8 × 10⁻⁵ in single variant models for set-point VL, nor did any have an FDR <0.20. Table 2 lists the top variants from individual variant models with a p < . Even with this cut-off, the list of variants is cumbersome

and many are in complete LD with each other; variants within NOTCH4 and in the intergenic region between RPP21 and HLA-E were heavily represented. HyperLasso analysis initially including all MHC SNPs and adjusting for gender and age found five variants to be associated with set-point VL; after further adjustment for previously identified HLA variants, two SNPs remained in the multivariable model: rs and rs (overall R² = 28.9) (Table 3 and Figure 1). The first, rs , is an intergenic SNP between RPP21 and HLA-E; rs is intronic within NOTCH4. These two variants were also among the top hits in the individual models, although their adjusted p values in individual variant analysis ranked only 11th and 16th, respectively.

Extended MHC variants and HIV-1 chronic VL in seroprevalent individuals. As in SCs, no SNP reached the significance threshold in individual models for chronic VL in SPs by either the p value or the FDR criterion (Table 4). After adjustment for gender, age, and HLA variants, the top SNPs in single variant models were located within the class II region of the MHC. rs , which lies between HLA-DQB1 and HLA-DQA2, was the SNP most significantly associated with chronic VL (overall R² = 14.6, p=5.4 × 10⁻⁴). The HyperLasso analyses identified one variant downstream from HLA-DOB that was associated with chronic VL both before and after additional adjustment for previously identified HLA alleles (overall R² = 14.9) (Table 5 and Figure 2). The C allele of rs was associated with increased VL.
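The HyperLasso results above depend on a scale parameter calibrated against null datasets, which Materials and Methods describes as random re-pairings of phenotype/covariate records with genotypes. A minimal sketch of that re-pairing step follows; it is not the authors' code, and the function name and toy records are hypothetical. Each phenotype record keeps its own covariates, so the covariate-VL relationship is preserved while any genotype-VL relationship is broken.

```python
import random


def make_null_dataset(phenotype_rows, genotype_rows, rng):
    """Re-pair genotype rows at random with intact (VL, covariate) rows."""
    shuffled = list(genotype_rows)  # copy so the input order is untouched
    rng.shuffle(shuffled)
    return list(zip(phenotype_rows, shuffled))


# Hypothetical toy records: (log10 VL, age, sex) and minor-allele counts per SNP.
phenotype_rows = [(4.6, 28, "F"), (5.1, 31, "M"), (3.9, 25, "F"), (4.8, 40, "M")]
genotype_rows = [(0, 1, 2), (1, 1, 0), (2, 0, 0), (0, 0, 1)]

rng = random.Random(2013)  # fixed seed so the sketch is repeatable
null_datasets = [make_null_dataset(phenotype_rows, genotype_rows, rng)
                 for _ in range(100)]  # 100 null permutations, as in the text
print(len(null_datasets), null_datasets[0][0])
```

In a full analysis, the penalized model would be fitted to each null dataset over a grid of scale values, and the value giving on average about one selected SNP would be retained; that fitting step is omitted here.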

DISCUSSION

The impact of LD is an important consideration when performing analyses that include a large number of genetic polymorphisms within specific regions of the genome. For multiple testing, it is unclear whether the GWAS standard that assumes all tests are independent (p value threshold of <5 × 10⁻⁸) is still relevant, or whether a less conservative, study-specific threshold accounting for LD can be employed. This is particularly questionable for regions like the MHC, where many of the variants tested are not independent because of extensive LD. Statistical methods that take LD into account may help to distinguish true-positive signals that have been overlooked in the past because they failed to reach this conservative p value 1,8. Further, standard stepwise regression analyses fail to account for multicollinearity caused by LD, which can lead to poorly performing models that present misleading results 40. Several recent studies have demonstrated that modeling all SNPs simultaneously in a multivariable regression framework, such as that employed by the HyperLasso, can improve power for SNP discovery and can help to identify multiple independent associations in the presence of LD 37,38.

Through the use of HyperLasso, we were able to identify several regions within the extended MHC that may further explain the variability in host genetic control of VL. These regions were detected even after accounting for classical HLA alleles known to be associated with VL control in our Zambian cohort.

Our results for rs in NOTCH4 with set-point VL are particularly noteworthy. NOTCH4 is a member of the NOTCH gene family and has a role in regulating cell fate 43. Two previous studies of HIV-1 outcomes have reported

associations with rs , a coding SNP within NOTCH4 7,43; Le Clerc et al. reported an unfavorable relationship between rs and disease progression. Although this variant did not occur at a high enough frequency for validation testing within the present cohort, it does appear to be in LD with rs , and both have shown unfavorable associations with HIV-1 outcomes. Future candidate gene studies of NOTCH4 should help to reveal the immunological importance of this gene for HIV-1 outcomes.

The association observed with rs is less clear. The variants included on the ImmunoChip may not provide sufficient coverage within this region of the MHC, and there were no coding variants within HLA-E; this signal could therefore be due to rs being in LD with an underlying coding variant that we were unable to capture through our genotyping platform.

The association between rs and VL in SPs may also deserve attention. HLA-DOB encodes the β chain, which pairs with HLA-DOα to form a heterodimer that is expressed in B-cell lysosomes and is involved in regulating HLA-DM-mediated loading of short antigenic peptides onto MHC class II molecules. Our analysis of HIV-1 acquisition also observed a signal for a variant in HLA-DOA and accelerated time-to-transmission, though it was not statistically significant. No prior studies have reported an association between HLA-DO genes and HIV-1 outcomes, but considering the role of the molecules encoded by these genes, they may be of interest for further investigation if the findings here can be validated.

The lack of overlap between the results for SCs and SPs emphasizes the importance of timing of infection when using VL as a phenotype. Although our analysis of chronic VL included over twice as many SPs compared to SCs for the set-point VL

analysis, we were unable to detect a single variant in individual models that significantly explained additional variation in chronic VL. The stage of infection for SPs at enrollment is unknown because their date of infection is unknown; our definition of chronic VL is therefore less precise than that for set-point VL. Similar to the observations of Tang et al., however, individual variants highlighted in the SC analysis did exhibit consistent, although diminished, associations in SPs 22. The predictive value of RNA VL for HIV-1 disease progression is well established 47,48, but this phenotype reflects an equilibrium established between viral replication and the host immune microenvironment. As infection progresses, other factors, particularly viral adaptation, may become more prominent in mediating VL phenotypes; knowledge of the timing of infection is therefore necessary 49.

Even though there is a clear advantage to applying penalized regression models in large-scale analyses of genetic variants, these approaches are not without limitations. The penalty parameter controls the number of variables selected into the final model: a larger penalty results in a smaller subset 50. Because the number of SNPs included in a model usually vastly exceeds the number of samples, regularization is required to provide model stability, at the expense of bias in the individual regression coefficients. Moreover, because the focus of these models is on prediction rather than hypothesis testing, it is difficult to obtain unbiased coefficient estimates and standard errors for the variables selected into the model, which requires moving away from the usual reliance on p values in current large-scale genetic studies 50. Also, as noted by Hoggart et al., methods such as the HyperLasso typically include the best SNPs out of a pool of closely related variants for characterizing the

association under study 37. Thus, the identified models are not necessarily unique, as several competing models could provide virtually identical predictive accuracy.

We were able to detect two associations, one for the NOTCH4 gene and the other for HLA-DOB, which may be relevant for future exploration through candidate gene studies if the findings presented here can be validated as true association signals. In addition, penalized regression approaches appear to be valuable for disentangling complex associations within the context of high LD in regions such as the extended MHC.

ACKNOWLEDGMENTS

This work was supported by multiple grants, including R01 AI (to E.H.), R37 AI (to E.H.), and R01 AI (to R.A.K./J.T.) from the NIAID; UL1 RR from the Clinical Translational Science Award program, NCRR; Fogarty International Center D43 TW001042; and funding from the International AIDS Vaccine Initiative (IAVI) to S.A. This work was further supported by the IAVI Protocol C research network.

REFERENCES

1. Guergnon J, Theodorou I. What did we learn on host's genetics by studying large cohorts of HIV-1-infected patients in the genome-wide association era? Curr Opin HIV AIDS. 2011;6(4):
2. Aouizerat BE, Pearce CL, Miaskowski C. The search for host genetic factors of HIV/AIDS pathogenesis in the post-genome era: progress to date and new avenues for discovery. Curr HIV/AIDS Rep. 2011;8(1):38-44.

3. Fellay J, Shianna KV, Ge D, et al. A whole-genome association study of major determinants for host control of HIV-1. Science. 2007;317(5840):
4. Dalmasso C, Carpentier W, Meyer L, et al. Distinct genetic loci control plasma HIV-RNA and cellular HIV-DNA levels in HIV-1 infection: the ANRS Genome Wide Association 01 study. PLoS One. 2008;3(12):e
5. Limou S, Le Clerc S, Coulonges C, et al. Genomewide association study of an AIDS-nonprogression cohort emphasizes the role played by HLA genes (ANRS Genomewide Association Study 02). J Infect Dis. 2009;199(3):
6. Le Clerc S, Limou S, Coulonges C, et al. Genomewide association study of a rapid progression cohort identifies new susceptibility alleles for AIDS (ANRS Genomewide Association Study 03). J Infect Dis. 2009;200(8):
7. Fellay J, Ge D, Shianna KV, et al. Common genetic variation and the control of HIV-1 in humans. PLoS Genet. 2009;5(12):e
8. Herbeck JT, Gottlieb GS, Winkler CA, et al. Multistage genomewide association study identifies a locus at 1q41 associated with rate of HIV-1 disease progression to clinical AIDS. J Infect Dis. 2010;201(4):
9. Pelak K, Goldstein DB, Walley NM, et al. Host determinants of HIV-1 control in African Americans. J Infect Dis. 2010;201(8):
10. Pereyra F, Jia X, McLaren PJ, et al. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science. 2010;330(6010):
11. Catano G, Kulkarni H, He W, et al. HIV-1 disease-influencing effects associated with ZNRD1, HCP5 and HLA-C alleles are attributable mainly to either HLA-A10 or HLA-B*57 alleles. PLoS One. 2008;3(11):e
12. Shrestha S, Aissani B, Song W, Wilson CM, Kaslow RA, Tang J. Host genetics and HIV-1 viral load set-point in African-Americans. AIDS. 2009;23(6):
13. Chapman SJ, Hill AV. Human genetic susceptibility to infectious disease. Nat Rev Genet. 2012;13(3):

14. Donfack J, Buchinsky FJ, Post JC, Ehrlich GD. Human susceptibility to viral infection: the search for HIV-protective alleles among Africans by means of genome-wide studies. AIDS Res Hum Retroviruses. 2006;22(10):
15. McLaren PJ, Ripke S, Pelak K, et al. Fine-mapping classical HLA variation associated with durable host control of HIV-1 infection in African Americans. Hum Mol Genet. 2012;21(19):
16. Tang J, Tang S, Lobashevsky E, et al. Favorable and unfavorable HLA class I alleles and haplotypes in Zambians predominantly infected with clade C human immunodeficiency virus type 1. J Virol. 2002;76(16):
17. Tang J, Tang S, Lobashevsky E, et al. HLA allele sharing and HIV type 1 viremia in seroconverting Zambians with known transmitting partners. AIDS Res Hum Retroviruses. 2004;20(1):
18. Lazaryan A, Lobashevsky E, Mulenga J, et al. Human leukocyte antigen B58 supertype and human immunodeficiency virus type 1 infection in native Africans. J Virol. 2006;80(12):
19. Song W, He D, Brill I, et al. Disparate associations of HLA class I markers with HIV-1 acquisition and control of viremia in an African population. PLoS One. 2011;6(8):e
20. Malhotra R, Hu L, Song W, et al. Association of chemokine receptor gene (CCR2-CCR5) haplotypes with acquisition and control of HIV-1 infection in Zambians. Retrovirology. 2011;8:
21. Yue L, Prentice HA, Farmer P, et al. Cumulative impact of host and viral factors on HIV-1 viral-load control during early infection. J Virol. 2012;87(2):
22. Tang J, Malhotra R, Song W, et al. Human leukocyte antigens and HIV type 1 viral load in early and chronic infection: predominance of evolving relationships. PLoS One. 2010;5(3):e
23. Tang J, Cormier E, Gilmour J, et al. Human leukocyte antigen variants B*44 and B*57 are consistently favorable during two distinct phases of primary HIV-1 infection in sub-Saharan Africans with several viral subtypes. J Virol. 2011;85(17):

24. Crawford H, Matthews PC, Schaefer M, et al. The hypervariable HIV-1 capsid protein residues comprise HLA-driven CD8+ T-cell escape mutations and covarying HLA-independent polymorphisms. J Virol. 2011;85(3):
25. Fideli US, Allen SA, Musonda R, et al. Virologic and immunologic determinants of heterosexual transmission of human immunodeficiency virus type 1 in Africa. AIDS Res Hum Retroviruses. 2001;17(10):
26. Trask SA, Derdeyn CA, Fideli U, et al. Molecular epidemiology of human immunodeficiency virus type 1 transmission in a heterosexual cohort of discordant couples in Zambia. J Virol. 2002;76(1):
27. Kempf MC, Allen S, Zulu I, et al. Enrollment and retention of HIV discordant couples in Lusaka, Zambia. J Acquir Immune Defic Syndr. 2008;47(1):
28. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):
29. Manichaikul A, Palmas W, Rodriguez CJ, et al. Population structure of Hispanics in the United States: the multi-ethnic study of atherosclerosis. PLoS Genet. 2012;8(4):e
30. Zhu X, Li S, Cooper RS, Elston RC. A unified association analysis approach for family and unrelated samples correcting for stratification. Am J Hum Genet. 2008;82(2):
31. International HapMap 3 Consortium, Altshuler DM, Gibbs RA, et al. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467(7311):
32. Cortes A, Brown MA. Promise and pitfalls of the Immunochip. Arthritis Res Ther. 2011;13(1):
33. Browning BL, Yu Z. Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genomewide association studies. Am J Hum Genet. 2009;85(6):

34. de Bakker PI, McVean G, Sabeti PC, et al. A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet. 2006;38(10):
35. Gao X, Starmer J, Martin ER. A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet Epidemiol. 2008;32(4):
36. Gao X, Becker LC, Becker DM, Starmer JD, Province MA. Avoiding the high Bonferroni penalty in genome-wide association studies. Genet Epidemiol. 2010;34(1):
37. Hoggart CJ, Whittaker JC, De Iorio M, Balding DJ. Simultaneous analysis of all SNPs in genome-wide and re-sequencing association studies. PLoS Genet. 2008;4(7):e
38. Vignal CM, Bansal AT, Balding DJ. Using penalised logistic regression to fine map HLA variants for rheumatoid arthritis. Ann Hum Genet. 2011;75(6):
39. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21(2):
40. Malo N, Libiger O, Schork NJ. Accommodating linkage disequilibrium in genetic-association analyses via ridge regression. Am J Hum Genet. 2008;82(2):
41. Wu TT, Chen YF, Hastie T, Sobel E, Lange K. Genome-wide association analysis by lasso penalized logistic regression. Bioinformatics. 2009;25(6):
42. Li J, Das K, Fu G, Li R, Wu R. The Bayesian lasso for genome-wide association studies. Bioinformatics. 2011;27(4):
43. Le Clerc S, Coulonges C, Delaneau O, et al. Screening low-frequency SNPs from genome-wide association study reveals a new risk allele for progression to AIDS. J Acquir Immune Defic Syndr. 2011;56(3):
44. Santin I, Castellanos-Rubio A, Aransay AM, et al. Exploring the diabetogenicity of the HLA-B18-DR3 CEH: independent association with T1D genetic risk close to HLA-DOA. Genes Immun. 2009;10(6):

45. Souwer Y, Chamuleau ME, van de Loosdrecht AA, et al. Detection of aberrant transcription of major histocompatibility complex class II antigen presentation genes in chronic lymphocytic leukaemia identifies HLA-DOA mRNA as a prognostic factor for survival. Br J Haematol. 2009;145(3):
46. Xiu F, Côté MH, Bourgeois-Daigneault MC, et al. Cutting edge: HLA-DO impairs the incorporation of HLA-DM into exosomes. J Immunol. 2011;187(4):
47. Mellors JW, Kingsley LA, Rinaldo CR Jr, et al. Quantitation of HIV-1 RNA in plasma predicts outcome after seroconversion. Ann Intern Med. 1995;122(8):
48. Mellors JW, Rinaldo CR Jr, Gupta P, White RM, Todd JA, Kingsley LA. Prognosis in HIV-1 infection predicted by the quantity of virus in plasma. Science. 1996;272(5265):
49. Prentice HA, Tang J. HIV-1 dynamics: a reappraisal of host and viral factors, as well as methodological issues. Viruses. 2012;4(10):
50. Ayers KL, Cordell HJ. SNP selection in genome-wide and candidate gene studies via penalized logistic regression. Genet Epidemiol. 2010;34(8):

FIGURE 1. Associations of single nucleotide polymorphisms within the extended major histocompatibility complex with Box-Cox transformed log₁₀ viral load in seroconverters. A) Results adjusted for age at time of infection and sex; B) Results adjusted for age at time of infection, sex, HLA-A*74, B*13, and B*57.

FIGURE 2. Associations of single nucleotide polymorphisms within the extended major histocompatibility complex with Box-Cox transformed log₁₀ viral load in seroprevalent individuals. A) Results adjusted for age at the time of VL measurement and sex; B) Results adjusted for age at the time of VL measurement, sex, HLA-A*36, A*74, B*57, B*81, and DRB1*01:02.

TABLE 1. Overall characteristics of 172 seroconverters (SCs) and 449 seroprevalent subjects (SPs) with SNP genotyping results.

Variables                                     SCs (N = 172)    SPs (N = 449)
Sex ratio (M/F)                               0.6 (66/106)     1.1 (237/212)
Age: mean ± SD (yr) *                         30.5 ±           ± 7.9
Estimated dates of enrollment
  Earliest                                    Apr 1995         Mar 1995
  Latest                                      Feb 2008         Sept 2008
Estimated dates of infection (EDI)
  Earliest                                    Dec
  Latest                                      May
Duration of follow-up (mth): median (IQR)     ( )
HLA factors: N (%)
  A*74                                        22 (12.8)        62 (13.8)
  B*13                                        4 (2.3)          17 (3.8)
  B*57                                        13 (7.6)         58 (12.9)
  B*81                                        11 (6.4)         39 (8.7)
  DRB1*01:02                                  22 (12.8)        34 (7.6)
VL outcome
  Log₁₀ VL: mean ± SD                         4.6 ± 0.7        4.8 ± 0.7
  VL categories: N (%)
    <10,000 copies/ml                         38 (22.1)        76 (16.9)
    10,000-100,000 copies/ml                  83 (48.3)        174 (38.8)
    >100,000 copies/ml                        51 (29.6)        199 (44.3)

* Age: obtained at EDI for SCs; obtained at viral load collection for SPs.
Viral load (VL): calculated as the geometric mean of all VL measurements collected 3-12 months after the EDI for SCs; first available VL measurement for SPs.

TABLE 2. Summary of regression analyses for geometric mean set-point viral load in HIV-1 seroconverters. *

SNP    Gene               Class        Associated allele
rs     OR12D3 OR12D2      intergenic   T
rs     RPP21 HLA-E        intergenic   T
rs     HLA-DOA            intronic     T
rs     HCG17              intergenic   T
rs     OR12D3 OR12D2      intergenic   A
rs     NOTCH4             intronic     G
rs     NOTCH4             intronic     A
rs     NOTCH4             intronic     A
rs     NOTCH4             intronic     C
rs     RPP21 HLA-E        intergenic   A
rs     RPP21 HLA-E        intergenic   C
rs     RPP21 HLA-E        intergenic   G
rs     RPP21 HLA-E        intergenic   T
rs     LY6G6F             exonic       A
rs     RPP21 HLA-E        intergenic   G
rs     NOTCH4             intronic     G
rs     HCG9               intronic     T
rs     PPT2               intronic     C
rs     EGFL8 / AGPAT1     complex      A

* Additive models; results meeting the p value cut-off through Analysis 2 are presented.
Analysis 1 adjusting for age at the estimated date of infection (EDI) and gender; R² = 7.8.
Analysis 2 adjusting for age at the EDI, gender, HLA-A*74, B*13, and B*57; R² = 20.7.

TABLE 3. Summary of HyperLasso results for Box-Cox transformed log₁₀ geometric mean viral load in seroconverters.

SNP    Gene            Class        Allele   Relation to VL
Analysis 1 *
rs     HCG9            ncRNA        T        Favorable
rs     RPP21 HLA-E     intergenic   G        Unfavorable
rs     HCG4 HLA-G      upstream     C        Favorable
rs     TAP2            intronic     G        Favorable
rs     NOTCH4          intronic     G        Unfavorable
Analysis 2
rs     RPP21 HLA-E     intergenic   C        Favorable
rs     NOTCH4          intronic     G        Unfavorable

* Analysis 1 adjusting for age at the estimated date of infection (EDI) and gender; R² = 34.4.
Analysis 2 adjusting for age at the EDI, gender, HLA-A*74, B*13, and B*57; R² = 28.9.

TABLE 4. Summary of regression analyses for viral load in HIV-1 seroprevalent individuals. *

SNP    Gene                  Class        Associated allele
rs     HLA-DQB1 HLA-DQA2     intergenic   T
rs     BTNL2 HLA-DRA         intergenic   G
rs     HLA-DQB1 HLA-DQA2     intergenic   T

* Additive models; results meeting the p value cut-off through Analysis 2 are presented.
Analysis 1 adjusting for age at VL and gender; R² = 7.6.
Analysis 2 adjusting for age at VL, gender, B*57, B*81, and DRB1*01:02; R² = 12.2.

TABLE 5. Summary of HyperLasso results for Box-Cox transformed log₁₀ viral load (VL) in seroprevalent individuals.

SNP    Gene                 Class        Allele   Relation to VL
Analysis 1 *
rs     HCG9                 ncRNA        G        Favorable
rs     MUC21                intergenic   C        Favorable
rs     HLA-DQB2 HLA-DOB     downstream   C        Unfavorable
Analysis 2
rs     HLA-DQB2 HLA-DOB     downstream   C        Unfavorable

* Analysis 1 adjusting for age at VL and gender; R² = 14.8.
Analysis 2 adjusting for age at VL, gender, HLA-A*36, A*74, B*57, B*81, and DRB1*01:02; R² = 14.9.

TABLE S1. Summary of null permutations to calibrate scale parameter of HyperLasso model. *

Scale    SCs: COV1 (Mean, Median)    SCs: COV2 (Mean, Median)    SPs: COV3 (Mean, Median)    SPs: COV4 (Mean, Median)

* Chosen scale parameter values are highlighted in bold for each analysis.
COV1: Adjusted for sex and age at the estimated time of infection.
COV2: Adjusted for sex, age at the estimated time of infection, and carriage of HLA-A*74, B*13, or B*57.
COV3: Adjusted for sex and age at the time of VL measurement.
COV4: Adjusted for sex, age at the time of VL measurement, and carriage of HLA-A*36, A*74, B*57, B*81, or DRB1*01:02.

TABLE S2. Univariable results of covariates in seroconverter (SCs) and seroprevalent (SPs) multivariable viral load (VL) analyses. *

Covariates | Box-Cox log₁₀ VL: β ± SE, R², p | VL categories !!: pOR (95% CI), p

SC VL
  Age at EDI: 0.2 ± ( )
  Female gender: -3.7 ± < ( ) <0.001
  HLA-A*74: ± < ( )
  HLA-B*13: ± ( )
  HLA-B*57: ± ( )
SP VL
  Age at VL: 0.1 ± ( )
  Female gender: -4.1 ± < ( ) <0.001
  HLA-B*57: ± < ( ) <0.001
  HLA-B*81: ± ( )
  HLA-DRB1*01:02: ± ( )

* Univariable models.
!! VL categorized into three ordered levels: <10,000 copies/ml; 10,000-100,000 copies/ml; and >100,000 copies/ml.
SC VL: calculated as the geometric mean of all VL measurements collected 3-12 months after the EDI.
SP VL: first available VL measurement.
EDI = estimated date of infection.

POLYMORPHISMS IN THE IL10 GENE CLUSTER AND HIV-1 VIRAL LOAD IN EARLY AND CHRONIC INFECTION

by

HEATHER A. PRENTICE, NICHOLAS M. PAJEWSKI, KUI ZHANG, ELIZABETH E. BROWN, RICHARD A. KASLOW, AND JIANMING TANG

In preparation for Human Genetics

Format adapted for dissertation

ABSTRACT

Cytokines encoded by the interleukin (IL)-10 gene cluster on chromosome 1 are important for both innate and adaptive immune defenses, suggesting a potential role for genetic variation at this locus in HIV-1 pathogenesis and immune control. We investigated 239 single nucleotide polymorphisms (SNPs) within or near IL10, IL19, IL20, IL24, and IL10RA in a sub-Saharan African population of 172 seroconverters (SCs) with an estimated date of infection and 449 seroprevalent individuals (SPs) from Lusaka, Zambia. Linear regression was performed for the two subgroups separately, adjusting for genetic and non-genetic factors previously associated with HIV-1 viral load (VL) in this cohort. No variant or haplotype was significantly associated with VL in SCs or SPs after adjustment for covariates and for multiple comparisons. At an alpha level of 0.05, six intergenic variants for IL20 and IL24 (β=-0.2, p= ) and three intergenic SNPs near IL10RA (β=-0.2, p= ) exhibited protective associations, while one intronic SNP in IL10 was associated with increased VL (β=0.26, p=0.050) in SCs. In contrast, for SPs only one intergenic SNP near IL10 was associated with decreased VL (β=-0.19, p=0.033). Haplotype analysis also suggested associations within the IL19 gene for both SCs and SPs. We detected multiple nominal associations within the IL10 gene cluster for VL control in native Africans, through analysis of both individual variants and haplotypes. Further work, including replication and systems biology approaches, is needed to elucidate the relationship between the variants in this gene cluster and VL control.

INTRODUCTION

Interleukin (IL)-10 is an anti-inflammatory cytokine that plays a complex role in human infectious diseases 1-3. It is a critical factor in clearing viral infection 4, which implies a role for IL-10 and its closely related cytokines within the context of HIV-1 infection. Specifically, IL-10 indirectly regulates the innate and adaptive immune responses by reducing major histocompatibility complex class II expression and impairing production of pro-inflammatory cytokines; it can also activate a number of cells involved in the immune response, including natural killer cells and CD8+ T lymphocytes 3. IL-19, IL-20, and IL-24 are members of the IL-20 subfamily and play a role in tissue repair 3,5. IL-10, along with the IL-20 subfamily, functions in a number of host defenses against infection; knowledge of their role in HIV-1 pathogenesis and of their potential for the development of therapeutic agents is therefore of interest.

Prior studies have indicated that increased IL10 expression correlates with a higher viral load (VL) and lower CD4+ T cell count, particularly during later stages of HIV-1 infection. A number of studies have also highlighted associations between three single nucleotide polymorphisms (SNPs), and their associated haplotypes, within the promoter region of the IL10 gene and HIV-1 disease progression. These associations for -592A>C (rs ), -819T>C (rs ), and -1082A>G (rs ) have been consistently observed across different ancestral populations. More recently, Shrestha et al. expanded the search to focus on variants within the IL10 gene family (including IL19, IL20, and IL24) and two IL-10 receptor genes, IL10RA and IL10RB, leading to several novel observations that illustrate the need for systematic investigations 18.

A number of hypotheses have been proposed to explain the conflicting associations with IL-10. Biologically, IL-10 regulates different molecules in different HIV-1 infected tissue compartments, and it is also possible that variants within the IL10 gene family have pleiotropic effects 2. Epidemiologically, study timing may be crucial because the cytokine response changes with the progression of infection, resulting in diverse associations with HIV-1 outcomes 18,19. For example, Naicker et al. observed a time-dependent relationship between the -1082A>G and -592A>C genotypes and HIV-1 outcomes in a cohort of South African women 16. How genotypic variation in the IL10 gene family, as well as epistasis with co-receptor genes, affects VL during distinct phases of infection remains unclear.

The Zambian cohort of serodiscordant couples is one of the largest African cohorts in which factors related to a number of HIV-1 outcomes, including VL control, have been studied longitudinally. Several host genetic factors have previously been identified as favorable or unfavorable correlates of VL control within this cohort 20-25, but associations with genetic variants within the IL10 gene family have yet to be investigated. Such a characterization remains important given the large disease burden in sub-Saharan populations. Using the Illumina ImmunoChip, a custom genotyping array designed for replication and fine-mapping of (auto)immune-related genes, we report here the association between variants within the IL10 gene family, focusing specifically on the closely related genes in a cluster on chromosome 1q32 3 and their co-receptor genes, and VL control at two distinct phases of infection: early chronic infection and chronic infection 26,27.

MATERIALS AND METHODS

Subjects. We studied a sample of 724 HIV-1 positive individuals enrolled in the Zambia-Emory HIV Research Project (ZEHRP) cohort, including 242 seroconverting individuals (SCs) with an estimated date of infection (EDI) and 482 seroprevalent individuals (SPs). Individuals in this cohort were all antiretroviral therapy naïve and predominantly infected with HIV-1 subtype C. The design and structure of this cohort have been described in detail previously. This study was approved by the institutional review boards in Lusaka, at Emory University, and at the University of Alabama at Birmingham; all subjects gave written informed consent for participation.

Subject selection. We excluded 11 samples with call rates less than 0.95; no samples displayed discrepancies between genetically inferred sex and the sex indicated in the clinical record. For pairs of individuals estimated to be third-degree relatives or closer, we used the unrelated selection procedure described in Manichaikul et al. and implemented in the KING software package (nine individuals excluded) 31.

Population stratification. Population stratification was assessed using multidimensional scaling (MDS) implemented in KING 32,33. We included data on 1,184 unrelated individuals from the eleven populations in Phase 3 of the International HapMap Project. The HapMap 3 samples were genotyped on both the Illumina 1M and Affymetrix 6.0 genome-wide arrays. For the MDS analysis, we used a subset of SNPs (~30,000) that (1) were annotated with rsIDs on the ImmunoChip, (2) were outside of regions of known extended linkage disequilibrium (LD) in European populations, and (3) could be reliably aligned with the HapMap 3 data (i.e., after removing ambiguous A/T and C/G SNPs) 34. Four outlying samples were excluded from further analysis.
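The removal of ambiguous A/T and C/G SNPs before alignment with the HapMap 3 data can be sketched as follows. The reason for this step is that the two alleles of such a SNP are each other's complement, so the strand on which a genotype was reported cannot be inferred from the alleles alone. The function name and the example SNP labels are hypothetical; this is an illustration of the principle, not the pipeline used in the study.

```python
# Watson-Crick complements of the four bases
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """Return True for A/T and C/G SNPs, whose alleles are complementary."""
    a1, a2 = allele1.upper(), allele2.upper()
    return COMPLEMENT.get(a1) == a2


# Hypothetical SNP labels mapped to their two alleles
snps = {
    "snp_a": ("A", "G"),
    "snp_b": ("A", "T"),  # ambiguous
    "snp_c": ("C", "G"),  # ambiguous
    "snp_d": ("T", "C"),
}

kept = [name for name, (a1, a2) in snps.items() if not is_strand_ambiguous(a1, a2)]
print(kept)
```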

Virologic assessment. Plasma VL was measured as HIV-1 RNA (copies/ml) using the Roche Amplicor 1.0 assay (Roche Diagnostics Systems Inc., Branchburg, NJ), with a lower limit of detection of 400 copies/ml. In all SCs, set-point VL was calculated as the log₁₀ transformed geometric mean of all available VL measurements collected between three and 12 months after the EDI; 49 SCs were excluded for lack of VL information during the required time period. In all SPs, the first available log₁₀ transformed VL measurement was used for analysis; five SPs were excluded for lack of VL information. Subjects with VL below the lower limit of detection were excluded (five SCs and 13 SPs).

Genotyping. HLA class I and class II genotyping was completed to four-digit resolution using a combination of PCR-based methods, including PCR with sequence-specific primers (SSP) (Dynal/Invitrogen, Brown Deer, WI), automated sequence-specific oligonucleotide (SSO) probe hybridization (Innogenetics, Alpharetta, GA), and sequencing-based typing (SBT) (Abbott Molecular, Inc., Des Plaines, IL) using capillary electrophoresis and the ABI 3130xl DNA Analyzer (Applied Biosystems, Foster City, CA). Genotypes on the Illumina ImmunoChip 35 were inferred using the joint calling and haplotype phasing algorithm implemented in BEAGLECALL 36. We completed a series of data cleaning and quality control procedures, excluding SNPs based on the following criteria: missingness (call rate <0.985), minor allele frequency (MAF) <0.025, and deviation from Hardy-Weinberg equilibrium (HWE) (p <10⁻⁶). Of the SNPs included on the ImmunoChip, 221 located within IL10, IL19, IL20, and IL24 and 18 located within IL10RA passed quality control. Variants within IL10RB, IL20RA, and IL20RB were excluded from further analysis due to

suboptimal coverage on the ImmunoChip and because few variants for each gene passed quality control.

Statistical analysis. Three complementary analysis approaches were used to assess the relationship between variants within the IL10 gene cluster and log₁₀ VL. First, individual SNPs were tested in single variant multivariable models assuming additive effects and adjusting for known factors associated with VL control. In the second approach, multi-SNP haplotypes were tested in multivariable models through PLINK 37 after identifying distinct haplotype blocks using HaploView 38,39. Finally, we investigated the potential for epistasis by considering models with pair-wise interactions between variants within IL10 and variants within IL10RA. The multivariable models for SCs were adjusted for gender, age at EDI, and three known human leukocyte antigen (HLA) variants with a clear relationship with VL (A*74, B*13, and B*57). The multivariable models for SPs were adjusted for gender, age at the time of VL measurement, and three HLA variants (B*57, B*81, and DRB1*01:02) previously associated with VL.

The normality assumption for the regression analyses of VL was evaluated using a Kolmogorov-Smirnov test. Because there was some indication of deviation from normality for the model residuals, we also considered models using a Box-Cox transformed version of log₁₀ VL. In general, the results were largely consistent with or without the use of the Box-Cox transformation. We also considered ordinal logistic regression models using a categorized VL outcome (<10,000 copies/ml, 10,000-100,000 copies/ml, and >100,000 copies/ml). This analysis was performed to evaluate the robustness of any identified effects with respect to the inherent measurement error of the assay used to quantify VL.
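The two VL outcome definitions used in these models (set-point VL as the log₁₀ transformed geometric mean of measurements taken three to 12 months after the EDI, and the three ordered VL categories) can be sketched as below. The log₁₀ of a geometric mean equals the arithmetic mean of the log₁₀ values, which is what the function computes. Function names, the input layout, and the example measurements are hypothetical, and the sketch does not handle values below the assay's limit of detection.

```python
import math


def set_point_log10_vl(measurements, min_month=3, max_month=12):
    """Log10 of the geometric mean VL within the set-point window.

    measurements: list of (months_since_edi, copies_per_ml) pairs.
    Returns None when no measurement falls inside the window.
    """
    in_window = [
        vl for months, vl in measurements
        if min_month <= months <= max_month and vl > 0
    ]
    if not in_window:
        return None
    # Mean of log10 values == log10 of the geometric mean
    return sum(math.log10(vl) for vl in in_window) / len(in_window)


def vl_category(log10_vl):
    """Ordered VL category: 0 = <10,000; 1 = 10,000-100,000; 2 = >100,000 copies/ml."""
    if log10_vl < 4.0:
        return 0
    if log10_vl <= 5.0:
        return 1
    return 2


# Hypothetical subject: only the measurements at months 4 and 8 are in the window
example = [(1, 1_000_000), (4, 10_000), (8, 100_000), (14, 500)]
set_point = set_point_log10_vl(example)
print(set_point, vl_category(set_point))
```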

RESULTS

Characteristics of seroconverting and seroprevalent individuals included for each analysis. Of the eligible individuals, 172 SCs and 449 SPs were successfully genotyped and met all inclusion criteria. Table 1 presents the overall characteristics of individuals included in the two analysis subgroups. The SC group included proportionally more women than the SP group (male-to-female sex ratio of 0.6 versus 1.1). On average, SCs were younger than SPs. Of the previously established genetic factors tested in both groups, HLA-B*57 carriage was more common in SPs (12.9 percent) than in SCs (7.6 percent). Mean log₁₀ VL tended to be lower in SCs than in SPs (4.6 ± 0.7 and 4.8 ± 0.7, respectively).

Individual variants in the IL10 gene cluster and VL control. No SNP within the IL10 gene cluster reached significance after correction for multiple testing, or an FDR < 0.2, in multivariable models for either SC or SP VL. The results for single variant models of set-point VL in SCs with a p < 0.05 in multivariable models are presented in Table 2. Of the top variants, four intergenic variants between IL20 and IL24 (rs c, rs g, rs g, and rs a) and two closely linked intergenic variants near IL24 (rs c and rs c, r ) were associated with decreased set-point VL (β=-0.2, p= ). The three intergenic variants near IL10RA associated with lower set-point VL (rs c, rs a, and rs c) (β=-0.2, p= ) also appear to be in LD with each other.
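The two multiple-testing criteria referred to above can be illustrated with a short sketch. The false discovery rate is computed here with the Benjamini-Hochberg step-up procedure, and a Bonferroni threshold for the 239 SNPs tested is shown for comparison. Both choices are assumptions made for illustration, since the specific FDR procedure and correction method are not stated in this section; the p values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p values from four single variant models
example_p = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(example_p)
flagged = [i for i, q in enumerate(q_values) if q < 0.20]

# Bonferroni threshold if all 239 SNPs were treated as independent tests
bonferroni_threshold = 0.05 / 239

print(q_values, flagged, bonferroni_threshold)
```

Because many of the 239 SNPs are in LD, a Bonferroni threshold of this kind is conservative, which is one motivation for reporting an FDR alongside it.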

Only one variant was associated with chronic VL in SP individuals at p < 0.05 (Table 3). rs t, an intergenic SNP near IL10, was associated with lower chronic VL (β=-0.19, p=0.033). In contrast to the finding in SPs, there was no apparent association between rs and set-point VL in SCs. With the exception of the variants intergenic between IL20 and IL24, the associations observed with chronic VL were in the same direction as those observed with set-point VL, though the effects were attenuated and no longer significant at p < 0.05.

Haplotypes and VL control. Haplotype analysis can be more powerful than single-marker methods if the underlying causal variants are better captured by haplotypes, or if the haplotypes themselves have a direct effect on the phenotype. Still, we were unable to detect a single haplotype that was significantly associated with either set-point VL or SP VL after correction for multiple testing or at an FDR < 0.20. The haplotypes within blocks (Supplemental Figure 1) associated with set-point VL at p < 0.05 in additive models are presented in Table 4. For haplotypes that included SNPs identified through single-marker models, the highlighted SNP for the most part accounted for the association observed in the single-marker models. One haplotype, H1, did show a stronger relationship than the single-marker models. Single-marker models identified rs t with increased VL (β=0.25, p=0.025), but the association appeared to be strengthened when coupled with rs c (β=0.42, p=0.003). Two haplotypes were associated with VL in SPs at p < 0.05 (Table 5 and Supplemental Figure 2). While results for single variants were not impressive, the H10

haplotype had a strong association with decreased VL (β=-0.42, p=0.006). This haplotype included 21 SNPs located within IL20 and IL24.

Epistasis and VL control. Pairing variants within IL10 with others within IL10RA did not reveal any significant interactions (p > 0.05 in all tests) (data not shown).

DISCUSSION

Here we highlighted several variants and haplotypes within the IL10 gene cluster potentially associated with VL outcomes, particularly with set-point VL in SCs. Several individual SNPs within the IL20 and IL24 genes, as well as the IL10RA gene, were associated with a lower set-point VL. One IL10 SNP, rs , was associated with a higher set-point VL. In addition, the H1 haplotype including rs and rs appeared to have a strong association with increased set-point VL. Results from single-marker models and haplotype models may vary slightly in strength and direction because alleles in the haplotype analysis are compared against the reference haplotype; overall, however, results from single-marker models were similar to those from haplotype analysis.

Our haplotype analysis did detect potentially novel associations missed through individual SNP analysis, both with set-point VL and chronic VL. Interestingly, these novel haplotypes were all found within the IL19 gene. Shrestha et al. also noted one SNP, rs , within the IL19 gene that was associated with CD4+ T-cell trajectory in their analysis of African American adolescents on highly active antiretroviral therapy 18. Further investigation of IL19 may be warranted, since associations for variants within this gene have been observed with multiple HIV-1 outcomes across different cohorts.

None of the associations we report here have p values below the threshold set to correct for multiple testing, either in analysis of individual variants or in haplotype analysis, suggesting that the results could be purely due to chance. Replication of our novel findings in an independent cohort is clearly needed.

We were able to verify a number of previously published associations for variants within the IL10 gene cluster in our cohort. Three SNPs highlighted in our results for SCs, rs (β=-0.19, p=0.010), rs (β=-0.17, p=0.029), and rs (β=-0.17, p=0.), have all previously been associated with disease progression in a cohort of African-American individuals 18. Fellay et al. observed rs to be correlated with higher VL in a European cohort 40; we noticed a trend in our cohort toward association with both higher set-point VL and higher chronic VL (β=0.16, p=0.110 and β=0.10, p=0.098, respectively). On the other hand, associations reported in other studies for -819 and could not be replicated in our cohort (p > for all models).

Even though the ImmunoChip was designed for fine-mapping, its coverage may not be optimal. Position -592 was not included on the ImmunoChip; therefore, previously reported associations with this variant could not be tested in our cohort 11, We also did not investigate variants within the IL10RB, IL20RA, or IL20RB genes because of the low number of SNPs available on the ImmunoChip. Future study with a genotyping platform providing better coverage of these co-receptor genes may allow for identification of genetic correlates of VL control.

In conclusion, our study observed associations between multiple variants across the IL10 gene cluster and VL outcomes in an African cohort. Most of our findings were for variants outside of the typically studied IL10 gene, highlighting the importance of a

comprehensive investigation of genes and their related families. Further, complementary analyses were beneficial in that haplotype analysis enabled us to identify novel associations beyond those identified through analysis of single variants. We were also able to confirm in our cohort a number of associations reported for other cohorts. Verification of the novel associations reported here will be necessary, as none were statistically significant after correction for multiple testing. Further study will be necessary to investigate the biological pathway for this gene cluster and, if the associations reported here can be confirmed, whether any of these variants have a role in gene expression.

ACKNOWLEDGMENTS

This work was supported by multiple grants, including R01 AI (to E.H.), R37 AI (to E.H.), R01 AI (to R.A.K./J.T.) from the NIAID, UL1 RR from the Clinical Translational Science Award program, NCRR, Fogarty International Center D43 TW001042, and funding from the International AIDS Vaccine Initiative (IAVI) to S.A. This work was further supported by the IAVI Protocol C research network.

REFERENCES

1. Moore KW, de Waal Malefyt R, Coffman RL, O'Garra A. Interleukin-10 and the interleukin-10 receptor. Annu Rev Immunol. 2001;19:
2. Kwon DS, Kaufmann DE. Protective and detrimental roles of IL-10 in HIV pathogenesis. Eur Cytokine Netw. 2010;21(3):
3. Hofmann SR, Rösen-Wolff A, Tsokos GC, Hedrich CM. Biological properties and regulation of IL-10 related cytokines and their contribution to autoimmune disease and tissue injury. Clin Immunol. 2012;143(2):

4. Brooks DG, Trifilo MJ, Edelmann KH, Teyton L, McGavern DB, Oldstone MB. Interleukin-10 determines viral clearance or persistence in vivo. Nat Med. 2006;12(11):
5. Ouyang W, Rutz S, Crellin NK, Valdez PA, Hymowitz SG. Regulation and functions of the IL-10 family of cytokines in inflammation and disease. Annu Rev Immunol. 2011;29:
6. Clerici M, Balotta C, Salvaggio A, et al. Human immunodeficiency virus (HIV) phenotype and interleukin-2/interleukin-10 ratio are associated markers of protection and progression in HIV infection. Blood. 1996;88(2):
7. Landay AL, Clerici M, Hashemi F, Kessler H, Berzofsky JA, Shearer GM. In vitro restoration of T cell immune function in human immunodeficiency virus-positive persons: effects of interleukin (IL)-12 and anti-IL-10. J Infect Dis. 1996;173(5):
8. Ostrowski MA, Gu JX, Kovacs C, Freedman J, Luscher MA, MacDonald KS. Quantitative and qualitative assessment of human immunodeficiency virus type 1 (HIV-1)-specific CD4+ T cell immunity to gag in HIV-1-infected individuals with differential disease progression: reciprocal interferon-gamma and interleukin-10 responses. J Infect Dis. 2001;184(10):
9. Trabattoni D, Saresella M, Biasin M, et al. B7-H1 is up-regulated in HIV infection and is a novel surrogate marker of disease progression. Blood. 2003;101(7):
10. Orsilles MA, Pieri E, Cooke P, Caula C. IL-2 and IL-10 serum levels in HIV-1-infected patients with or without active antiretroviral therapy. APMIS. 2006;114(1):
11. Shin HD, Winkler C, Stephens JC, et al. Genetic restriction of HIV-1 pathogenesis to AIDS by promoter alleles of IL10. Proc Natl Acad Sci U S A. 2000;97(26):
12. Vasilescu A, Heath SC, Ivanova R, et al. Genomic analysis of Th1-Th2 cytokine genes in an AIDS cohort: identification of IL4 and IL10 haplotypes associated with the disease progression. Genes Immun. 2003;4(6):

13. Shrestha S, Strathdee SA, Galai N, et al. Behavioral risk exposure and host genetics of susceptibility to HIV-1 infection. J Infect Dis. 2006;193(1):
14. Erikstrup C, Kallestrup P, Zinyama-Gutsire RB, et al. Reduced mortality and CD4 cell loss among carriers of the interleukin g allele in a Zimbabwean cohort of HIV-1-infected adults. AIDS. 2007;1(17):
15. Oleksyk TK, Shrestha S, Truelove AL, et al. Extended IL10 haplotypes and their association with HIV progression to AIDS. Genes Immun. 2009;10(4):
16. Naicker DD, Werner L, Kormuth E, et al. Interleukin-10 promoter polymorphisms influence HIV-1 susceptibility and primary HIV-1 pathogenesis. J Infect Dis. 2009;200(3):
17. Chatterjee A, Rathore A, Sivarama P, Yamamoto N, Dhole TN. Genetic association of IL-10 gene promoter polymorphism and HIV-1 infection in North Indians. J Clin Immunol. 2009;29(1):
18. Shrestha S, Wiener HW, Aissani B, et al. Interleukin-10 (IL-10) pathway: genetic variants and outcomes of HIV-1 infection in African American adolescents. PLoS One. 2010;5(10):e
19. Stacey AR, Norris PJ, Qin L, et al. Induction of a striking systemic cytokine cascade prior to peak viremia in acute human immunodeficiency virus type 1 infection, in contrast to more modest and delayed responses in acute hepatitis B and C virus infections. J Virol. 2009;83(8):
20. Tang J, Tang S, Lobashevsky E, et al. Favorable and unfavorable HLA class I alleles and haplotypes in Zambians predominantly infected with clade C human immunodeficiency virus type 1. J Virol. 2002;76(16):
21. Tang J, Tang S, Lobashevsky E, et al. HLA allele sharing and HIV type 1 viremia in seroconverting Zambians with known transmitting partners. AIDS Res Hum Retroviruses. 2004;20(1):
22. Lazaryan A, Lobashevsky E, Mulenga J, et al. Human leukocyte antigen B58 supertype and human immunodeficiency virus type 1 infection in native Africans. J Virol. 2006;80(12):

23. Malhotra R, Hu L, Song W, et al. Association of chemokine receptor gene (CCR2-CCR5) haplotypes with acquisition and control of HIV-1 infection in Zambians. Retrovirology. 2011;8:
24. Song W, He D, Brill I, et al. Disparate associations of HLA class I markers with HIV-1 acquisition and control of viremia in an African population. PLoS One. 2011;6(8):e
25. Yue L, Prentice HA, Farmer P, et al. Cumulative impact of host and viral factors on HIV-1 viral-load control during early infection. J Virol. 2012;87(2):
26. Tang J, Malhotra R, Song W, et al. Human leukocyte antigens and HIV type 1 viral load in early and chronic infection: predominance of evolving relationships. PLoS One. 2010;5(3):e
27. Tang J, Cormier E, Gilmour J, et al. Human leukocyte antigen variants B*44 and B*57 are consistently favorable during two distinct phases of primary HIV-1 infection in sub-Saharan Africans with several viral subtypes. J Virol. 2011;85(17):
28. Fideli US, Allen SA, Musonda R, et al. Virologic and immunologic determinants of heterosexual transmission of human immunodeficiency virus type 1 in Africa. AIDS Res Hum Retroviruses. 2001;17(10):
29. Trask SA, Derdeyn CA, Fideli U, et al. Molecular epidemiology of human immunodeficiency virus type 1 transmission in a heterosexual cohort of discordant couples in Zambia. J Virol. 2002;76(1):
30. Kempf MC, Allen S, Zulu I, et al. Enrollment and retention of HIV discordant couples in Lusaka, Zambia. J Acquir Immune Defic Syndr. 2008;47(1):
31. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):
32. Zhu X, Li S, Cooper RS, Elston RC. A unified association analysis approach for family and unrelated samples correcting for stratification. Am J Hum Genet. 2008;82(2):

33. Manichaikul A, Palmas W, Rodriguez CJ, et al. Population structure of Hispanics in the United States: the multi-ethnic study of atherosclerosis. PLoS Genet. 2012;8(4):e
34. International HapMap 3 Consortium, Altshuler DM, Gibbs RA, et al. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467(7311):
35. Cortes A, Brown MA. Promise and pitfalls of the Immunochip. Arthritis Res Ther. 2011;13(1):
36. Browning BL, Yu Z. Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genome-wide association studies. Am J Hum Genet. 2009;85(6):
37. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):
38. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21(2):
39. Gabriel SB, Schaffner SF, Nguyen H, et al. The structure of haplotype blocks in the human genome. Science. 2002;296(5576):
40. Fellay J, Shianna KV, Ge D, et al. A whole-genome association study of major determinants for host control of HIV-1. Science. 2007;317(5840):

TABLE 1. Overall characteristics of 172 seroconverters (SCs) and 449 seroprevalent subjects (SPs) with SNP genotyping results.

Variables                                  SCs (N = 172)    SPs (N = 449)
Sex ratio (M/F)                            0.6 (66/106)     1.1 (237/212)
Age: mean ± SD (yr) *                      30.5 ± ± 7.9
Dates of enrollment
  Earliest                                 4/1/1995         3/1/1995
  Latest                                   2/1/2008         9/1/2008
Estimated dates of infection (EDI)
  Earliest                                 12/1/
  Latest                                   5/1/
Duration of follow-up (mth): median (IQR)  ( )
HLA factors: N (%)
  A*74                                     22 (12.8)        62 (13.8)
  B*13                                     4 (2.3)          17 (3.8)
  B*57                                     13 (7.6)         58 (12.9)
  B*81                                     11 (6.4)         39 (8.7)
  DRB1*01:02                               22 (12.8)        34 (7.6)
HIV-1 viral load (VL)
  Log10 VL: mean ± SD                      4.6 ± 0.7        4.8 ± 0.7
  VL categories: n (%)
    <10,000 copies/ml                      38 (22.1)        76 (16.9)
    10,000-100,000 copies/ml               83 (48.3)        174 (38.8)
    >100,000 copies/ml                     51 (29.6)        199 (44.3)

* Age: obtained at EDI for SCs; obtained at VL collection for SPs
VL: calculated as the geometric mean of all VL measurements collected 3-12 months after the EDI for SCs; first available VL measurement for SPs
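The set-point VL definition in the Table 1 footnote (geometric mean of all measurements taken 3-12 months after the EDI) can be sketched as below. This is an illustration under stated assumptions: the function names and the `(months_since_edi, copies_per_ml)` input layout are hypothetical, and treating the 3- and 12-month boundaries as inclusive is an assumption. The sketch relies on the fact that the log10 of a geometric mean equals the arithmetic mean of the log10 values.

```python
import math


def geometric_mean_log10_vl(copies_per_ml):
    """Return log10 of the geometric mean of VL values (copies/ml)."""
    if not copies_per_ml:
        raise ValueError("at least one VL measurement is required")
    return sum(math.log10(v) for v in copies_per_ml) / len(copies_per_ml)


def set_point_log10_vl(measurements, min_month=3, max_month=12):
    """Set-point log10 VL from (months_since_edi, copies_per_ml) pairs.

    Only measurements inside the window are used; inclusive boundaries
    are an assumption of this sketch.
    """
    in_window = [vl for month, vl in measurements
                 if min_month <= month <= max_month]
    return geometric_mean_log10_vl(in_window)
```

For example, a person with measurements of 10,000 and 100,000 copies/ml inside the window has a set-point log10 VL of 4.5, regardless of any values recorded before month 3 or after month 12.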

TABLE 2. Summary of models for log10 transformed geometric mean set-point viral load in seroconverters. *

SNP   Gene        Class        MAF   Allele   Analysis 1: Log10 VL (Beta, R2, p), VL Categories (pOR, p)   Analysis 2: Log10 VL (Beta, R2, p), VL Categories (pOR, p)
rs    IL20 IL24   intergenic         C
rs    IL20 IL24   intergenic         G
rs    IL20 IL24   downstream         G
rs    IL10RA      intergenic         C
rs    IL20 IL24   intergenic         A
rs    IL24        intergenic         C
rs    IL24        intergenic         C
rs    IL19        intronic           T
rs    IL10RA      intergenic         A
rs    IL10RA      intergenic         C
rs    IL10        intronic           T

* Additive models, results where p < 0.05 through Analysis 2 are presented
Analysis 1 adjusting for age at the estimated date of infection (EDI) and gender; overall R2 = 7.8
Analysis 2 adjusting for age at the EDI, gender, HLA-A*74, B*13, and B*57; overall R2 =

TABLE 3. Summary of models for log10 transformed viral load in seroprevalent partners. *

SNP   Gene   Class        MAF   Allele   Analysis 1: Log10 VL (Beta, R2, p), VL Categories (pOR, p)   Analysis 2: Log10 VL (Beta, R2, p), VL Categories (pOR, p)
rs    IL10   intergenic         T

* Additive models, results where p < 0.05 through Analysis 2 are presented
Analysis 1 adjusting for age at VL and gender; overall R2 = 7.6
Analysis 2 adjusting for age at VL, gender, HLA-B*57, B*81, and DRB1*01:02; overall R2 =

TABLE 4. Haplotype analysis of log10 transformed geometric mean set-point viral load in seroconverters. *

Hap   Block   NSNP   5' SNP   3' SNP   Gene(s)     Haplotype                    Freq.   Log10 VL Beta   p
IL10, IL19, IL20, IL24
H1                   rs       rs       IL19        CT
H2    6       2      rs       rs       IL10 IL19   TA
H3                   rs       rs       IL19        CAGCCGACGGAACGATAGGGCTTTAT
H4                   rs       rs       IL10 IL19   CCTCATTCGATTCTGAT
H5                   rs       rs       IL19        TTGGGCTGATGAAAG
H6                   rs       rs       IL24        AAAATCCGTAGCTGGAGCGTC
H7                   rs       rs       IL19        GAACCAAT
H8                   rs       rs       IL10 IL19   CCTCATTTGATTCTGAT
IL10RA
H9    1       4      rs       rs       IL10RA      CAAC

* Additive models adjusting for age at the EDI, gender, HLA-A*74, B*13, and B*57, results where p < 0.05 are presented
Haplotype block as assigned through HaploView (Supplemental Figure 1)
Number of single nucleotide polymorphisms (SNPs) in haplotype block
H4 and H8 only differ by a single SNP (rs ) in bold

TABLE 5. Haplotype analysis of log10 transformed viral load in seroprevalent individuals. *

Hap   Block   NSNP   5' SNP   3' SNP   Gene   Haplotype               Freq.   Log10 VL Beta   p
IL10, IL19, IL20, and IL24
H                    rs       rs       IL24   GCGATCCGTAGCTGGAGCGTC
H                    rs       rs       IL19   AGTTA

* Additive models adjusting for age at VL, gender, HLA-B*57, B*81, and DRB1*01:02, results where p < 0.05 are presented
Haplotype block as assigned through HaploView (Supplemental Figure 2)
Number of single nucleotide polymorphisms (SNPs) in haplotype block

FIGURE S1. Haplotype blocks detected for seroconverters in a) IL10, IL19, IL20, and IL24 and b) IL10RA.

FIGURE S2. Haplotype blocks detected for seroprevalent individuals in a) IL10, IL19, IL20, and IL24 and b) IL10RA.

CONCLUSIONS

We found an association between a variant within the HLA-DOA gene (rs592625) and HIV-1 acquisition among Zambian serodiscordant couples. Although this association was no longer statistically significant after accounting for multiple testing, further investigation into this gene may be fruitful, since HLA-DOα plays a role in suppressing antigen presentation on MHC class II cells. This is the first study to consider time of transmission in analysis of HIV-1 acquisition, and the first to account for genetic and non-genetic factors in either member of the couple that are known to modulate HIV-1 transmission, including HLA-A*68:02 and P2-Met carriage in the initially seronegative partner and HLA-A*36, KIR2DS4, and VL in the index partner.

We built upon previous GWAS using VL as a phenotype 53,70,71,73,77,107 to conduct fine-mapping within the extended MHC for an exclusively African population and were able to detect novel signals outside of classic HLA alleles that appear to influence VL, both for set-point in recent SCs and during chronic infection in SPs. We demonstrated the utility of penalized regression models for disentangling independent association signals with VL in regions of high LD. Our findings for rs in NOTCH4 are consistent with two previous studies detecting signals with another variant in this gene 73,108 and support the need for further biologic and epidemiologic research into the relationship between NOTCH4 and HIV-1 outcomes. Further, the signal for rs within HLA-DOB for chronic VL is of interest, since our HIV-1 acquisition analysis also detected a signal within the closely related HLA-DOA gene.

We extended prior work on the IL10 gene cluster to our cohort of Zambian SCs and SPs. We demonstrated the value of a comprehensive investigation of genes and their closely related families that considers haplotypes in addition to individual SNPs in analytical models. According to our findings, most individual associations with set-point VL were for variants located in the IL20, IL24, and IL10RA genes, while one signal was detected with chronic VL for a variant in the IL10 gene. In our analysis of haplotypes, we also observed associations with the IL19 gene that were missed through single-variant models. None of these associations were statistically significant after correction for multiple testing, and they will need confirmation before further investigation is warranted. Moreover, we were also able to verify several associations previously reported for individual variants including rs , rs , and ,109.

Several studies have noted the benefits of the ImmunoChip for detecting independent signals at many loci in disease association studies 83,110. We likewise demonstrated the value of the ImmunoChip for identification of novel associations with three different HIV-1 phenotype definitions. In particular, use of the ImmunoChip helped identify novel associations for our cohort in the extended MHC and IL10 gene cluster that were consistently in the same direction for both SCs and SPs, though the effect sizes and p values did differ.

Still, there are a number of important limitations that apply to our study overall. Coverage of the ImmunoChip may be suboptimal, and its use may have led us to overlook some of the host genetic variation in HIV-1 acquisition and immune control. Even though the ImmunoChip was designed to allow for fine-mapping, the majority of our important associations were not with variants known to cause coding changes. This may primarily be

due to how variants were selected for inclusion on the ImmunoChip, focusing on markers with a known relationship to immune and autoimmune function for a selected subset of diseases 84,85. This is evidenced by the fact that we were not able to verify in our Zambian cohort a number of previously published associations with HIV-1 outcomes in other cohorts (i.e. rs and rs ) because these variants were not included on the array.

Moreover, the ImmunoChip was designed for use within populations of European descent. Although we demonstrated that the ImmunoChip performed well in distinguishing continental ancestry in our Zambian cohort and the HapMap3 populations, differences in allele frequencies and shorter spans of LD may have limited our ability to find statistically significant signals. Our hope in applying the ImmunoChip to an African cohort was that the higher density of coverage on the genotyping platform and lower levels of LD in this population would improve the likelihood of identifying variants with causal genetic associations with HIV-1 outcomes. Yet both the differences in allele frequencies and the varying haplotype diversity between Africans and Europeans, the intended ImmunoChip population, may have limited its capability to detect signals for this cohort 30,31.

In addition, our sample size did not provide sufficient power to detect rare variants with low MAFs or common variants with small effect sizes. No SNP met the required significance threshold in any analysis, regardless of analytical approach. We also did not have a secondary cohort in which to validate the observed signals. The relatively small sample size also meant that we were unable to stratify our acquisition analysis according to direction of transmission (male-to-female versus female-to-male), an important modifier of

transmission 34. A larger cohort, or a secondary cohort in which to validate the novel associations reported here, is needed for more detailed investigation.

Use of the ImmunoChip proved comparable in our scan of the extended MHC; all of the top variants noted in our findings could also be found on Illumina's 1M Duo BeadChip. In contrast, the majority of SNPs highlighted in our IL10 gene cluster findings were found on the ImmunoChip only. Depending on the phenotype under investigation, at $39 per chip, the ImmunoChip is not only beneficial for identifying novel association signals, but can also be a cost-effective alternative to standard genome-wide chips.

Given the burden of disease in sub-Saharan African populations, discovery of novel variants that influence HIV-1 outcomes is important in advancing personalized medicine that may help prevent infection or help in treatment of disease after infection. These variants can also direct new initiatives for research in development of vaccines. The findings presented here are largely preliminary; they will need to be tested to see whether the associations still hold, either through increasing the sample size of this cohort or through validation analyses in a second cohort. If they can be validated as true association signals, our findings can help guide future research directions. A more detailed understanding of how these novel genetic regions may influence HIV-1 outcomes will be essential.

GENERAL LIST OF REFERENCES

1. Joint United Nations Programme on HIV/AIDS. Global report: UNAIDS report on the global AIDS epidemic 2012: World Health Organization;
2. Daar ES, Moudgil T, Meyer RD, Ho DD. Transient high levels of viremia in patients with primary HIV-1 infection. N Engl J Med. 1991;324(14):
3. de Wolf F, Spijkerman I, Schellekens PT, et al. AIDS prognosis based on HIV-1 RNA, CD4+ T-cell count and function: markers with reciprocal predictive value over time after seroconversion. AIDS. 1997;11(15):
4. Lavreys L, Baeten JM, Chohan V, et al. Higher set point plasma viral load and more-severe acute HIV type 1 (HIV-1) illness predict mortality among high-risk HIV-1-infected African women. Clin Infect Dis. 2006;42(9):
5. Geskus RB, Prins M, Hubert JB, et al. The HIV RNA setpoint theory revisited. Retrovirology. 2007;4:
6. Girard MP, Osmanov S, Assossou OM, Kieny MP. Human immunodeficiency virus (HIV) immunopathogenesis and vaccine development: a review. Vaccine. 2011;29(37):
7. Mellors JW, Rinaldo CR Jr, Gupta P, White RM, Todd JA, Kingsley LA. Prognosis in HIV-1 infection predicted by the quantity of virus in plasma. Science. 1996;272(5265):
8. Douek DC, Brenchley JM, Betts MR, et al. HIV preferentially infects HIV-specific CD4+ T cells. Nature. 2002;417(6884):
9. National Institute of Allergy and Infectious Diseases. What are HIV and AIDS? HIV/AIDS 2008; DS.aspx. Accessed November 27, 2011.

10. Mellors JW, Kingsley LA, Rinaldo CR Jr, et al. Quantitation of HIV-1 RNA in plasma predicts outcome after seroconversion. Ann Intern Med. 1995;122(8):
11. Serwadda D, Gray RH, Wawer MJ, et al. The social dynamics of HIV transmission as reflected through discordant couples in rural Uganda. AIDS. 1995;9(7):
12. Operskalski EA, Stram DO, Busch MP, et al. Role of viral load in heterosexual transmission of human immunodeficiency virus type 1 by blood transfusion recipients. Transfusion Safety Study Group. Am J Epidemiol. 1997;146(8):
13. Sewankambo N, Gray RH, Wawer MJ, et al. HIV-1 infection associated with abnormal vaginal flora morphology and bacterial vaginosis. Lancet. 1997;350(9077):
14. Farzadegan H, Hoover DR, Astemborski J, et al. Sex differences in HIV-1 viral load and progression to AIDS. Lancet. 1998;352(9139):
15. Leynaert B, Downs AM, de Vincenzi I. Heterosexual transmission of human immunodeficiency virus: variability of infectivity throughout the course of infection. European Study Group on Heterosexual Transmission of HIV. Am J Epidemiol. 1998;148(1):
16. Pedraza MA, del Romero J, Roldán F, et al. Heterosexual transmission of HIV-1 is associated with high plasma viral load levels and a positive viral isolation in the infected partner. J Acquir Immune Defic Syndr. 1999;21(2):
17. Sterling TR, Lyles CM, Vlahov D, Astemborski J, Margolick JB, Quinn TC. Sex differences in longitudinal human immunodeficiency virus type 1 RNA levels among seroconverters. J Infect Dis. 1999;180(3):
18. Bienzle D, MacDonald KS, Smaill FM, et al. Factors contributing to the lack of human immunodeficiency virus type 1 (HIV-1) transmission in HIV-1-discordant partners. J Infect Dis. 2000;182(1):
19. Long EM, Martin HL Jr, Kreiss JK, et al. Gender differences in HIV-1 diversity at time of infection. Nat Med. 2000;6(1):71-75.

20. Quinn TC, Wawer MJ, Sewankambo N, et al. Viral load and heterosexual transmission of human immunodeficiency virus type 1. Rakai Project Study Group. N Engl J Med. 2000;342(13):
21. Fideli US, Allen SA, Musonda R, et al. Virologic and immunologic determinants of heterosexual transmission of human immunodeficiency virus type 1 in Africa. AIDS Res Hum Retroviruses. 2001;17(10):
22. Galvin SR, Cohen MS. The role of sexually transmitted diseases in HIV transmission. Nat Rev Microbiol. 2004;2(1):
23. Kaur G, Mehra N. Genetic determinants of HIV-1 infection and progression to AIDS: susceptibility to HIV infection. Tissue Antigens. 2009;73(4):
24. An P, Nelson GW, Wang L, et al. Modulating influence on HIV/AIDS by interacting RANTES gene variants. Proc Natl Acad Sci U S A. 2002;99(15):
25. Chapman SJ, Hill AV. Human genetic susceptibility to infectious disease. Nat Rev Genet. 2012;13(3):
26. Guergnon J, Theodorou I. What did we learn on host's genetics by studying large cohorts of HIV-1-infected patients in the genome-wide association era? Curr Opin HIV AIDS. 2011;6(4):
27. Fellay J, Shianna KV, Telenti A, Goldstein DB. Host genetics and HIV-1: the final phase? PLoS Pathog. 2010;6(10):e
28. McCarthy MI, Abecasis GR, Cardon LR, et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet. 2008;9(5):
29. Rosenberg NA, Huang L, Jewett EM, Szpiech ZA, Jankovic I, Boehnke M. Genome-wide association studies in diverse populations. Nat Rev Genet. 2010;11(5):
30. Donfack J, Buchinsky FJ, Post JC, Ehrlich GD. Human susceptibility to viral infection: the search for HIV-protective alleles among Africans by means of genome-wide studies. AIDS Res Hum Retroviruses. 2006;22(10):

31. Teo YY, Small KS, Kwiatkowski DP. Methodological challenges of genome-wide association analysis in Africa. Nat Rev Genet. 2010;11(2):
32. Fellay J. Host genome influences on HIV-1 disease. Antivir Ther. 2009;14(6):
33. An P, Winkler CA. Host genes associated with HIV/AIDS: advances in gene discovery. Trends Genet. 2010;26(3):
34. Kaslow RA, Dorak T, Tang JJ. Influence of host genetic variation on susceptibility to HIV type 1 infection. J Infect Dis. 2005;191(Suppl 1):S
35. Lederman MM, Alter G, Daskalakis DC, et al. Determinants of protection among high risk HIV-exposed seronegative persons: an overview. J Infect Dis. 2010;202(Suppl 3):S
36. Shea PR, Shianna KV, Carrington M, Goldstein DB. Host genetics of HIV acquisition and viral control. Annu Rev Med. 2013;64(13):
37. Singh P, Kaur G, Sharma G, Mehra NK. Immunogenetic basis of HIV-1 infection, transmission and disease progression. Vaccine. 2008;26(24):
38. Kinter A, Arthos J, Cicala C, Fauci AS. Chemokines, cytokines and HIV: a complex network of interactions that influence HIV pathogenesis. Immunol Rev. 2000;177(88-98).
39. Carrington M, O'Brien SJ. The influence of HLA genotype on AIDS. Annu Rev Med. 2003;54:
40. Kelleher AD, Long C, Holmes EC, et al. Clustered mutations in HIV-1 gag are consistently required for escape from HLA-B27-restricted cytotoxic T lymphocyte responses. J Exp Med. 2001;193(3):
41. Allen TM, Altfeld M, Geer SC, et al. Selective escape from CD8+ T-cell responses represents a major driving force of human immunodeficiency virus type 1 (HIV-1) sequence diversity and reveals constraints on HIV-1 evolution. J Virol. 2005;79(21):

42. Ammaranond P, Zaunders J, Satchell C, van Bockel D, Cooper DA, Kelleher AD. A new variant cytotoxic T lymphocyte escape mutation in HLA-B27-positive individuals infected with HIV type 1. AIDS Res Hum Retroviruses. 2005;21(5):
43. Martinez-Picado J, Prado JG, Fry EE, et al. Fitness cost of escape mutations in p24 Gag in association with control in human immunodeficiency virus type 1. J Virol. 2006;80(7):
44. Salgado M, Simón A, Sanz-Minguela B, et al. An additive effect of protective host genetic factors correlates with HIV nonprogression status. J Acquir Immune Defic Syndr. 2011;56(4):
45. Fabio G, Scorza R, Lazzarin A, et al. HLA-associated susceptibility to HIV-1 infection. Clin Exp Immunol. 1992;87(1):
46. Rowland-Jones S, Sutton J, Ariyoshi K, et al. HIV-specific cytotoxic T-cells in HIV-exposed but uninfected Gambian women. Nat Med. 1995;1(1):
47. Fowke KR, Nagelkerke NJ, Kimani J, et al. Resistance to HIV-1 infection among persistently seronegative prostitutes in Nairobi, Kenya. Lancet. 1996;348(9038):
48. Rowland-Jones S, Dong T, Fowke KR, et al. Cytotoxic T cell responses to multiple conserved HIV epitopes in HIV-resistant prostitutes in Nairobi. J Clin Invest. 1998;102(9):
49. Moore CB, John M, James IR, Christiansen FT, Witt CS, Mallal S. Evidence of HIV-1 adaptation to HLA-restricted immune responses at a population level. Science. 2002;296(5572):
50. Merino A, Malhotra R, Morton M, et al. Impact of a functional KIR2DS4 allele on heterosexual HIV-1 transmission among discordant Zambian couples. J Infect Dis. 2011;203(4):
51. Telenti A, McLaren P. Genomic approaches to the study of HIV-1 acquisition. J Infect Dis. 2010;202(S3):S

144 Petrovski S, Fellay J, Shianna KV, et al. Common human genetic variants and HIV-1 susceptibility: a genome-wide survey in a homogeneous African population. AIDS. 2010;25(4): Lingappa JR, Petrovski S, Kahle E, et al. Genomewide association study for determinants of HIV-1 acquisition and viral set point in HIV-1 serodiscordant couples with quantified virus exposure. PLoS One. 2011;6(12):e Limou S, Delaneau O, van Manen D, et al. Multicohort genomewide association study reveals a new signal of protection against HIV-1 acquisition. J Infect Dis. 2012;205(7): Liu R, Paxton WA, Choe S, et al. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell. 1996;86(3): Dean M, Carrington M, Winkler C, et al. Genetic restriction of HIV-1 infection and progression to AIDS by a deletion allele of the CKR5 structural gene. Hemophilia Growth and Development Study, Multicenter AIDS Cohort Study, Multicenter Hemophilia Cohort Study, San Francisco City Cohort, ALIVE Study. Science. 1996;273(5283): Huang Y, Paxton WA, Wolinsky SM, et al. The role of a mutant CCR5 allele in HIV-1 transmission and disease progression. Nat Med. 1996;2(11): Michael NL, Chang G, Louie LG, et al. The role of viral phenotype and CCR-5 gene defects in HIV-1 transmission and disease progression. Nat Med. 1997;3(3): Zimmerman PA, Buckler-White A, Alkhatib G, et al. Inherited resistance to HIV- 1 conferred by an inactivating mutation in CC chemokine receptor 5: studies in populations with contrasting clinical phenotypes, defined racial background, and quantified risk. Mol Med. 1997;3(1): Mann DL, Garner RP, Dayoff DE, et al. Major histocompatibility complex genotype is associated with disease progression and virus load levels in a cohort of human immunodeficiency virus type 1-infected Caucasians and African Americans. J Infect Dis. 1998;178(6):

145 Martinson JJ, Hong L, Karanicolas R, Moore JP, Kostrikis LG. Global distribution of the CCR2-64I/CCR T HIV-1 disease-protective haplotype. AIDS. 2000;14(5): Tang J, Shelton B, Makhatadze NJ, et al. Distribution of chemokine receptor CCR2 and CCR5 genotypes and their relative contribution to human immunodeficiency virus type 1 (HIV-1) seroconversion, early HIV-1 RNA concentration in plasma, and later disease progression. J Virol. 2002;76(2): Shrestha S, Strathdee SA, Galai N, et al. Behavioral risk exposure and host genetics of susceptibility to HIV-1 infection. J Infect Dis. 2006;193(1): Malhotra R, Hu L, Song W, et al. Association of chemokine receptor gene (CCR2-CCR5) haplotypes with acquisition and control of HIV-1 infection in Zambians. Retrovirology. 2011;8: McDermott DH, Beecroft MJ, Kleeberger CA, et al. Chemokine RANTES promoter polymorphism affects risk of both HIV infection and disease progression in the Multicenter AIDS Cohort Study. AIDS. 2000;14(17): Gonzalez E, Kulkarni H, Bolivar H, et al. The influence of CCL3L1 genecontaining segmental duplications on HIV-1/AIDS susceptibility. Science. 2005;307(5714): Modi WS, Lautenberger J, An P, et al. Genetic variation in the CCL18-CCL3- CCL4 chemokine gene cluster influences HIV Type 1 transmission and AIDS disease progression. Am J Hum Genet. 2006;79(1): Bhattacharya T, Stanton J, Kim EY, et al. CCL3L1 and HIV/AIDS susceptibility. Nat Med. 2009;15(10): Prentice HA, Tang J. HIV-1 dynamics: a reappraisal of host and viral factors, as well as methodological issues. Viruses. 2012;4(10): Fellay J, Shianna KV, Ge D, et al. A whole-genome association study of major determinants for host control of HIV-1. Science. 2007;317(5840):

146 Dalmasso C, Carpentier W, Meyer L, et al. Distinct genetic loci control plasma HIV-RNA and cellular HIV-DNA levels in HIV-1 infection: the ANRS Genome Wide Association 01 study. PLoS One. 2008;3(12):e Catano G, Kulkarni H, He W, et al. HIV-1 disease-influencing effects associated with ZNRD1, HCP5 and HLA-C alleles are attributable mainly to either HLA- A10 or HLA-B*57 alleles. PLoS One. 2008;3(11):e Fellay J, Ge D, Shianna KV, et al. Common genetic variation and the control of HIV-1 in humans. PLoS Genet. 2009;5(12):e Limou S, Le Clerc S, Coulonges C, et al. Genomewide association study of an AIDS-nonprogression cohort emphasizes the role played by HLA genes (ANRS Genomewide Association Study 02). J Infect Dis. 2009;199(3): Pereyra F, Jia X, McLaren PJ, et al. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science. 2010;330(6010): Shrestha S, Aissani B, Song W, Wilson CM, Kaslow RA, Tang J. Host genetics and HIV-1 viral load set-point in African-Americans. AIDS. 2009;23(6): Pelak K, Goldstein DB, Walley NM, et al. Host determinants of HIV-1 control in African Americans. J Infect Dis. 2010;201(8): McLaren PJ, Ripke S, Pelak K, et al. Fine-mapping classical HLA variation associated with durable host control of HIV-1 infection in African Americans. Hum Mol Genet. 2012;21(19): Painter TM. Voluntary counseling and testing for couples: a high-leverage intervention for HIV/AIDS prevention in sub-saharan Africa. Soc Sci Med. 2001;53(11): Trask SA, Derdeyn CA, Fideli U, et al. Molecular epidemiology of human immunodeficiency virus type 1 transmission in a heterosexual cohort of discordant couples in Zambia. J Virol. 2002;76(1): Kempf MC, Allen S, Zulu I, et al. Enrollment and retention of HIV discordant couples in Lusaka, Zambia. J Acquir Immune Defic Syndr. 2008;47(1):

147 Genomes Project Consortium, Abecasis GR, Altshuler D, et al. A map of human genome variation from population-scale sequencing. Nature. 2010;467(7319): Trynka G, Hunt KA, Bockett NA, et al. Dense genotyping identifies and localizes multiple common and rare variant association signals in celiac disease. Nat Genet. 2011;43(12): Cortes A, Brown MA. Promise and pitfalls of the Immunochip. Arthritis Res Ther. 2011;13(1): Nikula T, West A, Katajamaa M, et al. A human ImmunoChip cdna microarray provides a comprehensive tool to study immune responses. J Immunol Methods. 2005;303(1-2): Illumina Inc. Genotyping rare variants. San Diego Browning BL, Yu Z. Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genomewide association studies. Am J Hum Genet. 2009;85(6): de Bakker PI, McVean G, Sabeti PC, et al. A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet. 2006;38(10): Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22): Manichaikul A, Palmas W, Rodriguez CJ, et al. Population structure of Hispanics in the United States: the multi-ethnic study of atherosclerosis. PLoS Genet. 2012;8(4):e Zhu X, Li S, Cooper RS, Elston RC. A unified association analysis approach for family and unrelated samples correcting for stratification. Am J Hum Genet. 2008;82(2):

148 International HapMap 3 Consortium, Altshuler DM, Gibbs RA, et al. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467(7311): Pemberton TJ, Wang C, Li JZ, Rosenberg NA. Inference of unexpected genetic relatedness among individuals in HapMap Phase III. Am J Hum Genet. 2010;87(4): MacDonald KS, Fowke KR, Kimani J, et al. Influence of HLA supertypes on susceptibility and resistance to human immunodeficiency virus type 1 infection. J Infect Dis. 2000;181(5): Tang J, Shao W, Yoo YJ, et al. Human leukocyte antigen class I genotyes in relation to heterosexual HIV type 1 transmission within discordant couples. J Immunol. 2008;181(4): Tang J, Malhotra R, Song W, et al. Human leukocyte antigens and HIV type 1 viral load in early and chronic infection: predominance of evolving relationships. PLoS One. 2010;5(3):e Li J, Das K, Fu G, Li R, Wu R. The Bayesian lasso for genome-wide association studies. Bioinformatics. 2011;27(4): Hoggart CJ, Whittaker JC, De Iorio M, Balding DJ. Simultaneous analysis of all SNPs in genome-wide and re-sequencing association studies. PLoS Genet. 2008;4(7):e Malo N, Libiger O, Schork NJ. Accommodating linkage disequilibrium in genetic-association analyses via ridge regression. Am J Hum Genet. 2008;82(2): Bøvelstad HM, Nygård S, Borgan O. Survival prediction from clinico-genomic models--a comparative study. BMC Bioinformatics. 2009;10: Wu TT, Chen YF, Hastie T, Sobel E, Lange K. Genome-wide association analysis by lasso penalized logistic regression. Bioinformatics. 2009;25(6): Ayers KL, Cordell HJ. SNP selection in genome-wide and candidate gene studies via penalized logistic regression. Genet Epidemiol. 2010;34(8):

149 Vignal CM, Bansal AT, Balding DJ. Using penalised logistic regression to fine map HLA variants for rheumatoid arthritis. Ann Hum Genet. 2011;75(6): Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3): Gao X, Starmer J, Martin ER. A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet Epidemiol. 2008;32(4): Gao X, Becker LC, Becker DM, Starmer JD, Province MA. Avoiding the high Bonferroni penalty in genome-wide association studies. Genet Epidemiol. 2010;34(1): Herbeck JT, Gottlieb GS, Winkler CA, et al. Multistage genomewide association study identifies a locus at 1q41 associated with rate of HIV-1 disease progression to clinical AIDS. J Infect Dis. 2010;201(4): Le Clerc S, Coulonges C, Delaneau O, et al. Screening low-frequency SNPS from genome-wide association study reveals a new risk allele for progression to AIDS. J Acquir Immune Defic Syndr. 2011;56(3): Shrestha S, Wiener HW, Aissani B, et al. Interleukin-10 (IL-10) pathway: genetic variants and outcomes of HIV-1 infection in African American adolescents. PLoS One. 2010;5(10):e Polychronakos C. Fine points in mapping autoimmunity. Nat Genet. 2011;43(12):

APPENDIX A

COMPARISON OF SUBJECTS INCLUDED FOR EACH ANALYSIS AND REMAINING ELIGIBLE SUBJECTS WITHIN ZEHRP NOT INCLUDED FOR ANALYSIS

To address potential selection bias after inclusion of individuals for genotyping and application of exclusion criteria (described in the Materials and Methods section), variables of interest were compared between the individuals included in each analysis and the remaining individuals within ZEHRP who were not included in that analysis. Remaining individuals comprised subjects who were eligible for genotyping but not genotyped and subjects who failed to meet all inclusion criteria.

In the acquisition analysis, gender and age differed between included and excluded SCs (Table 1). In addition, index partner log10 VL was lower in analyzed ESNs. For the analysis of SC set-point VL, only gender differed significantly between the groups: a greater percentage of females was included for analysis (Table 2). Younger subjects with a longer duration of follow-up were included for analysis of SP chronic VL (Table 3). Viral load also tended to be higher in SPs included for analysis than in those excluded, although this may be due to the exclusion of individuals with VL below the lower limit of detection. These individuals were excluded because it cannot be determined whether such a value reflects a true VL measurement or a false reading.
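The included-versus-excluded comparison described above can be illustrated with a short sketch. The appendix does not state which statistical test was used, so the Pearson chi-square test below is an assumption chosen only for illustration. The counts are the male/female counts shown for SCs in Table 2 of this appendix; the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Male and female counts for SCs as shown in Table 2 (analyzed vs. excluded).
analyzed_m, analyzed_f = 66, 106
excluded_m, excluded_f = 161, 146

# Sex ratio (M/F) as reported in the table's "Sex ratio" row.
ratio_analyzed = analyzed_m / analyzed_f
ratio_excluded = excluded_m / excluded_f

chi2, p_value = chi_square_2x2(analyzed_m, analyzed_f, excluded_m, excluded_f)

print(f"Sex ratio analyzed: {ratio_analyzed:.1f}, excluded: {ratio_excluded:.1f}")
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Under this assumed test, the difference in sex distribution between analyzed and excluded SCs falls below a conventional 0.05 threshold, which is in line with the statement that gender differed between the groups. The p value computed here is not necessarily the one reported in the original table.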

TABLE 1. Demographic, genetic, and virologic characteristics of seroconverting individuals (SCs) and exposed seronegative individuals (ESNs) analyzed compared to eligible individuals from ZEHRP not included for analysis.

Variables Analyzed SCs Excluded SCs p Analyzed ESNs Excluded ESNs p
N
Sex ratio (M/F) 0.7 (84/128) 1.1 (74/68) (122/105) 1.0 (238/232)
Age at enrollment: mean ± SD (yr) 28.1 ± ± 8.3 < ± ±
Dates of enrollment
Earliest Mar 1995 May 1995 Mar 1995 Feb 1995
Latest Feb 2008 Apr 2011 Jan 2006 Mar 2010
DOF: median (IQR) (weeks) 71 (29-152) 67 (24-139) ( ) 192 ( )
Covariates: N (%)
HLA-A*68:02 40 (18.9) 19 (21.6) (N=88) (12.3) 18 (13.2) (N=136)
P2-Met carrier 104 (50.5) (N=206) 21 (46.7) (N=45) (40.8) (N=191) 21 (43.8) (N=48)
Donor HLA-A*36 35 (16.5) 9 (10.3) (N=87) (6.6) (N=226) 4 (2.9) (N=136)
Donor KIR2DS4 187 (89.5) (N=209) 42 (68.9) (N=61) < (76.0) (N=225) 68 (78.2) (N=87)
Donor log10 VL: mean ± SD 5.0 ± 0.7 (N=208) 4.9 ± 0.8 (N=116) ± 1.0 (N=226) 4.6 ± 0.9 (N=435)

DOF=duration of follow-up; VL=viral load; Donor=Index partner

TABLE 2. Demographic, genetic, and virologic characteristics of seroconverting individuals (SCs) analyzed compared to eligible SCs from ZEHRP not included for analysis.

Variables Analyzed Excluded p
N
Sex ratio (M/F) 0.6 (66/106) 1.1 (161/146)
Age: mean ± SD (yr) * 30.5 ± ±
Estimated dates of enrollment
Earliest Apr 1995 Mar 1995
Latest Feb 2008 Apr 2011
Estimated dates of infection (EDI)
Earliest Dec 1995 July 1995
Latest May 2009 July 2011
HLA factors: N (%)
HLA-A*74 22 (12.8) 22 (11.6)
HLA-B*13 4 (2.3) 3 (1.6)
HLA-B*57 13 (7.6) 21 (11.1)
Log10 VL: mean ± SD 4.6 ± ±
VL Categories: n (%)
>100,000 copies/ml 51 (29.6) 55 (32.9)
10,000-100,000 copies/ml 83 (48.3) 67 (40.1)
<10,000 copies/ml 38 (22.1) 45 (27.0)

* Age: obtained at EDI for SCs
Viral load (VL): calculated as the geometric mean of all VL measurements collected 3-12 months after the EDI for SCs
Of the remaining eligible individuals in ZEHRP, N=189 with complete genotyping and N=167 with VL data
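The set-point VL definition in the Table 2 footnote (geometric mean of all VL measurements collected 3-12 months after the EDI) and the three VL categories can be sketched as follows. This is only an illustrative sketch: the function names, the input format, and the measurements are hypothetical, and the treatment of values falling exactly on 10,000 or 100,000 copies/ml is an assumption, since the table does not say which category includes the boundaries.

```python
import math


def set_point_vl(measurements, window=(3.0, 12.0)):
    """Geometric mean of VL values measured within `window` months after the EDI.

    `measurements` is a list of (months_since_edi, copies_per_ml) pairs.
    Returns None when no positive measurement falls inside the window.
    """
    start, end = window
    in_window = [vl for months, vl in measurements
                 if start <= months <= end and vl > 0]
    if not in_window:
        return None
    # Geometric mean = 10 ** (arithmetic mean of the log10 values).
    mean_log10 = sum(math.log10(vl) for vl in in_window) / len(in_window)
    return 10 ** mean_log10


def vl_category(copies_per_ml):
    """Assign one of the three VL categories used in Tables 2 and 3.

    Assumption: the boundary values 10,000 and 100,000 fall in the middle category.
    """
    if copies_per_ml > 100_000:
        return ">100,000"
    if copies_per_ml >= 10_000:
        return "10,000-100,000"
    return "<10,000"


# Hypothetical measurements for one seroconverter: (months since EDI, copies/ml).
example = [(1.0, 1_000_000), (4.0, 10_000), (8.0, 100_000), (14.0, 500)]

sp = set_point_vl(example)   # uses only the 4- and 8-month values
category = vl_category(sp)
print(f"set-point VL = {sp:.0f} copies/ml (log10 = {math.log10(sp):.2f}), category {category}")
```

Averaging on the log10 scale keeps a single very high early measurement from dominating the summary, which is consistent with the tables reporting VL as log10 values.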

TABLE 3. Demographic, genetic, and virologic characteristics of seroprevalent individuals (SPs) analyzed compared to eligible SP individuals from ZEHRP not included for analysis.

Variables Analyzed Excluded p
N
Sex ratio (M/F) 1.1 (237/212) 0.9 (351/375)
Age: mean ± SD (yr) * 32.7 ± ±
Estimated dates of enrollment
Earliest Mar 1995 Feb 1995
Latest Sept 2008 Apr 2011
Duration of follow-up (mos): median (IQR) 17.9 ( ) 15.2 ( )
HLA factors: N (%)
B*57 58 (12.9) 42 (15.7)
B*81 39 (8.7) 13 (4.9)
DRB1*01:02 34 (7.6) 19 (7.2)
Log10 VL: mean ± SD 4.8 ± ±
VL Categories: n (%)
>100,000 copies/ml 199 (44.3) 230 (36.9)
10,000-100,000 copies/ml 174 (38.8) 261 (41.9)
<10,000 copies/ml 76 (16.9) 132 (21.2)

* Age: obtained at VL collection for SPs
Viral load (VL): first available VL measurement for SPs
Of the remaining eligible individuals in ZEHRP, N=265 with complete genotyping and N=598 with VL data

APPENDIX B

INSTITUTIONAL REVIEW BOARD APPROVAL FORM


More information

Significance of the MHC

Significance of the MHC CHAPTER 7 Major Histocompatibility Complex (MHC) What is is MHC? HLA H-2 Minor histocompatibility antigens Peter Gorer & George Sneell (1940) Significance of the MHC role in immune response role in organ

More information

Lecture 6. Burr BIO 4353/6345 HIV/AIDS. Tetramer staining of T cells (CTL s) Andrew McMichael seminar: Background

Lecture 6. Burr BIO 4353/6345 HIV/AIDS. Tetramer staining of T cells (CTL s) Andrew McMichael seminar: Background Lecture 6 Burr BIO 4353/6345 HIV/AIDS Andrew McMichael seminar: Background Tetramer staining of T cells (CTL s) 1. Vβ 19: There are 52 T cell receptor (TCR) Vβ gene segments in germ line DNA (See following

More information

HIV/AIDS & Immune Evasion Strategies. The Year First Encounter: Dr. Michael Gottleib. Micro 320: Infectious Disease & Defense

HIV/AIDS & Immune Evasion Strategies. The Year First Encounter: Dr. Michael Gottleib. Micro 320: Infectious Disease & Defense Micro 320: Infectious Disease & Defense HIV/AIDS & Immune Evasion Strategies Wilmore Webley Dept. of Microbiology The Year 1981 Reported by MS Gottlieb, MD, HM Schanker, MD, PT Fan, MD, A Saxon, MD, JD

More information

Principles of Adaptive Immunity

Principles of Adaptive Immunity Principles of Adaptive Immunity Chapter 3 Parham Hans de Haard 17 th of May 2010 Agenda Recognition molecules of adaptive immune system Features adaptive immune system Immunoglobulins and T-cell receptors

More information

Mitochondrial DNA variation associated with gait speed decline among older HIVinfected non-hispanic white males

Mitochondrial DNA variation associated with gait speed decline among older HIVinfected non-hispanic white males Mitochondrial DNA variation associated with gait speed decline among older HIVinfected non-hispanic white males October 2 nd, 2017 Jing Sun, Todd T. Brown, David C. Samuels, Todd Hulgan, Gypsyamber D Souza,

More information

HLA and antigen presentation. Department of Immunology Charles University, 2nd Medical School University Hospital Motol

HLA and antigen presentation. Department of Immunology Charles University, 2nd Medical School University Hospital Motol HLA and antigen presentation Department of Immunology Charles University, 2nd Medical School University Hospital Motol MHC in adaptive immunity Characteristics Specificity Innate For structures shared

More information

Copyright 2011 Joint United Nations Programme on HIV/AIDS (UNAIDS) All rights reserved ISBN

Copyright 2011 Joint United Nations Programme on HIV/AIDS (UNAIDS) All rights reserved ISBN UNAIDS DATA TABLES 2011 Copyright 2011 Joint United Nations Programme on HIV/AIDS (UNAIDS) All rights reserved ISBN 978-92-9173-945-5 UNAIDS / JC2225E The designations employed and the presentation of

More information

Major Histocompatibility Complex (MHC) and T Cell Receptors

Major Histocompatibility Complex (MHC) and T Cell Receptors Major Histocompatibility Complex (MHC) and T Cell Receptors Historical Background Genes in the MHC were first identified as being important genes in rejection of transplanted tissues Genes within the MHC

More information

Immunology - Lecture 2 Adaptive Immune System 1

Immunology - Lecture 2 Adaptive Immune System 1 Immunology - Lecture 2 Adaptive Immune System 1 Book chapters: Molecules of the Adaptive Immunity 6 Adaptive Cells and Organs 7 Generation of Immune Diversity Lymphocyte Antigen Receptors - 8 CD markers

More information

State of Alabama HIV Surveillance 2014 Annual Report

State of Alabama HIV Surveillance 2014 Annual Report State of Alabama HIV Surveillance 2014 Annual Report Prepared by: Division of STD Prevention and Control HIV Surveillance Branch Contact Person: Richard P. Rogers, MS, MPH richard.rogers@adph.state.al.us

More information

HIV acute infections and elite controllers- what can we learn?

HIV acute infections and elite controllers- what can we learn? HIV acute infections and elite controllers- what can we learn? Thumbi Ndung u, BVM, PhD KwaZulu-Natal Research Institute for Tuberculosis and HIV (K-RITH) and HIV Pathogenesis Programme (HPP), Doris Duke

More information

HIV/AIDS CLINICAL CARE QUALITY MANAGEMENT CHART REVIEW CHARACTERISTICS OF PATIENTS FACTORS ASSOCIATED WITH IMPROVED IMMUNOLOGIC STATUS

HIV/AIDS CLINICAL CARE QUALITY MANAGEMENT CHART REVIEW CHARACTERISTICS OF PATIENTS FACTORS ASSOCIATED WITH IMPROVED IMMUNOLOGIC STATUS HIV/AIDS CLINICAL CARE QUALITY MANAGEMENT CHART REVIEW CHARACTERISTICS OF PATIENTS WITH LOW CD4 COUNTS IN 2008 AND FACTORS ASSOCIATED WITH IMPROVED IMMUNOLOGIC STATUS FROM 2004 THROUGH 2008 For the Boston

More information

DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK

DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK CHAPTER 6 DOES THE BRCAX GENE EXIST? FUTURE OUTLOOK Genetic research aimed at the identification of new breast cancer susceptibility genes is at an interesting crossroad. On the one hand, the existence

More information

Diversity and Frequencies of HLA Class I and Class II Genes of an East African Population

Diversity and Frequencies of HLA Class I and Class II Genes of an East African Population Open Journal of Genetics, 2014, 4, 99-124 Published Online April 2014 in SciRes. http://www.scirp.org/journal/ojgen http://dx.doi.org/10.4236/ojgen.2014.42013 Diversity and Frequencies of HLA Class I and

More information

An Evolutionary Story about HIV

An Evolutionary Story about HIV An Evolutionary Story about HIV Charles Goodnight University of Vermont Based on Freeman and Herron Evolutionary Analysis The Aids Epidemic HIV has infected 60 million people. 1/3 have died so far Worst

More information

Genomic structural variation

Genomic structural variation Genomic structural variation Mario Cáceres The new genomic variation DNA sequence differs across individuals much more than researchers had suspected through structural changes A huge amount of structural

More information

How HIV Causes Disease Prof. Bruce D. Walker

How HIV Causes Disease Prof. Bruce D. Walker How HIV Causes Disease Howard Hughes Medical Institute Massachusetts General Hospital Harvard Medical School 1 The global AIDS crisis 60 million infections 20 million deaths 2 3 The screen versions of

More information

Evidence of HIV-1 Adaptation to HLA- Restricted Immune Responses at a Population Level. Corey Benjamin Moore

Evidence of HIV-1 Adaptation to HLA- Restricted Immune Responses at a Population Level. Corey Benjamin Moore Evidence of HIV-1 Adaptation to HLA- Restricted Immune Responses at a Population Level Corey Benjamin Moore This thesis is presented for the degree of Doctor of Philosophy of Murdoch University, 2002 I

More information

Fertility Desires/Management of Serodiscordant HIV + Couples

Fertility Desires/Management of Serodiscordant HIV + Couples Fertility Desires/Management of Serodiscordant HIV + Couples William R. Short, MD, MPH Assistant Professor of Medicine Division Of Infectious Diseases Jefferson Medical College of Thomas Jefferson University

More information

What s New in Acute HIV Infection?

What s New in Acute HIV Infection? 3 4 Disclosure I have received research grants awarded to my institution from Gilead Sciences, Inc. ntiretroviral medications have been provided by Gilead Sciences, Inc. Susan Little, M.D. Professor of

More information

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time.

Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Supplement for: CD4 cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, gender and calendar time. Anne Cori* 1, Michael Pickles* 1, Ard van Sighem 2, Luuk Gras 2,

More information

Overview: The immune responses of animals can be divided into innate immunity and acquired immunity.

Overview: The immune responses of animals can be divided into innate immunity and acquired immunity. GUIDED READING - Ch. 43 - THE IMMUNE SYSTEM NAME: Please print out these pages and HANDWRITE the answers directly on the printouts. Typed work or answers on separate sheets of paper will not be accepted.

More information

DEFINITIONS OF HISTOCOMPATIBILITY TYPING TERMS

DEFINITIONS OF HISTOCOMPATIBILITY TYPING TERMS DEFINITIONS OF HISTOCOMPATIBILITY TYPING TERMS The definitions below are intended as general concepts. There will be exceptions to these general definitions. These definitions do not imply any specific

More information

State of Alabama HIV Surveillance 2013 Annual Report Finalized

State of Alabama HIV Surveillance 2013 Annual Report Finalized State of Alabama HIV Surveillance 2013 Annual Report Finalized Prepared by: Division of STD Prevention and Control HIV Surveillance Branch Contact Person: Allison R. Smith, MPH Allison.Smith@adph.state.al.us

More information

Supplementary Figure 1. Principal components analysis of European ancestry in the African American, Native Hawaiian and Latino populations.

Supplementary Figure 1. Principal components analysis of European ancestry in the African American, Native Hawaiian and Latino populations. Supplementary Figure. Principal components analysis of European ancestry in the African American, Native Hawaiian and Latino populations. a Eigenvector 2.5..5.5. African Americans European Americans e

More information

VIRAL HEPATITIS: SITUATION ANALYSIS AND PERSPECTIVES IN THE AFRICAN REGION. Report of the Secretariat. CONTENTS Paragraphs BACKGROUND...

VIRAL HEPATITIS: SITUATION ANALYSIS AND PERSPECTIVES IN THE AFRICAN REGION. Report of the Secretariat. CONTENTS Paragraphs BACKGROUND... 8 April 2014 REGIONAL COMMITTEE FOR AFRICA ORIGINAL: ENGLISH PROGRAMME SUBCOMMITTEE Sixty-fourth session Brazzaville, Republic of Congo, 9 11 June 2014 Provisional agenda item 6 VIRAL HEPATITIS: SITUATION

More information

Quality Control Analysis of Add Health GWAS Data

Quality Control Analysis of Add Health GWAS Data 2018 Add Health Documentation Report prepared by Heather M. Highland Quality Control Analysis of Add Health GWAS Data Christy L. Avery Qing Duan Yun Li Kathleen Mullan Harris CAROLINA POPULATION CENTER

More information

The Human Major Histocompatibility Complex

The Human Major Histocompatibility Complex The Human Major Histocompatibility Complex 1 Location and Organization of the HLA Complex on Chromosome 6 NEJM 343(10):702-9 2 Inheritance of the HLA Complex Haplotype Inheritance (Family Study) 3 Structure

More information

Cost per HIV-infection averted (HIA) through Couples VCT in Zambia

Cost per HIV-infection averted (HIA) through Couples VCT in Zambia Cost per HIV-infection averted (HIA) through Couples VCT in Zambia M. Inambao, K. Wall, W. Kilembe, E. Karita, A. Ticachek, G. Streeb, I. Thior, J. Pulerwitz, E. Chomba, S. Allen Zambia Emory HIV research

More information

Distinguishing epidemiological dependent from treatment (resistance) dependent HIV mutations: Problem Statement

Distinguishing epidemiological dependent from treatment (resistance) dependent HIV mutations: Problem Statement Distinguishing epidemiological dependent from treatment (resistance) dependent HIV mutations: Problem Statement Leander Schietgat 1, Kristof Theys 2, Jan Ramon 1, Hendrik Blockeel 1, and Anne-Mieke Vandamme

More information

VIRAL HEPATITIS: SITUATION ANALYSIS AND PERSPECTIVES IN THE AFRICAN REGION. Report of the Secretariat. CONTENTS Paragraphs BACKGROUND...

VIRAL HEPATITIS: SITUATION ANALYSIS AND PERSPECTIVES IN THE AFRICAN REGION. Report of the Secretariat. CONTENTS Paragraphs BACKGROUND... 5 November 2014 REGIONAL COMMITTEE FOR AFRICA ORIGINAL: ENGLISH Sixty-fourth session Cotonou, Republic of Benin, 3 7 November 2014 Provisional agenda item 11 VIRAL HEPATITIS: SITUATION ANALYSIS AND PERSPECTIVES

More information

Steady Ready Go! teady Ready Go. Every day, young people aged years become infected with. Preventing HIV/AIDS in young people

Steady Ready Go! teady Ready Go. Every day, young people aged years become infected with. Preventing HIV/AIDS in young people teady Ready Go y Ready Preventing HIV/AIDS in young people Go Steady Ready Go! Evidence from developing countries on what works A summary of the WHO Technical Report Series No 938 Every day, 5 000 young

More information

A novel approach to estimation of the time to biomarker threshold: Applications to HIV

A novel approach to estimation of the time to biomarker threshold: Applications to HIV A novel approach to estimation of the time to biomarker threshold: Applications to HIV Pharmaceutical Statistics, Volume 15, Issue 6, Pages 541-549, November/December 2016 PSI Journal Club 22 March 2017

More information

BST227 Introduction to Statistical Genetics. Lecture 4: Introduction to linkage and association analysis

BST227 Introduction to Statistical Genetics. Lecture 4: Introduction to linkage and association analysis BST227 Introduction to Statistical Genetics Lecture 4: Introduction to linkage and association analysis 1 Housekeeping Homework #1 due today Homework #2 posted (due Monday) Lab at 5:30PM today (FXB G13)

More information

Ch 18 Infectious Diseases Affecting Cardiovascular and Lymphatic Systems

Ch 18 Infectious Diseases Affecting Cardiovascular and Lymphatic Systems Ch 18 Infectious Diseases Affecting Cardiovascular and Lymphatic Systems Highlight Disease: Malaria World s dominant protozoal disease. Four species of Plasmodium: P. falciparum (malignant), P. vivax (begnin),

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S1 (table). Pathogen GWASs in NHGRI GWAS catalog (https://www.genome.gov/26525384) as of April 15, 2014 Disease or Pathogen Reference Pubmed ID minimum observed p-value # reported

More information

HLA and antigen presentation. Department of Immunology Charles University, 2nd Medical School University Hospital Motol

HLA and antigen presentation. Department of Immunology Charles University, 2nd Medical School University Hospital Motol HLA and antigen presentation Department of Immunology Charles University, 2nd Medical School University Hospital Motol MHC in adaptive immunity Characteristics Specificity Innate For structures shared

More information

Technical Bulletin No. 161

Technical Bulletin No. 161 CPAL Central Pennsylvania Alliance Laboratory Technical Bulletin No. 161 cobas 6800 HIV-1 Viral Load Assay - New Platform - June 1, 2017 Contact: Heather Habig, MLS (ASCP) CM, MB CM, 717-851-1422 Operations

More information

HIV: Pregnancy in Serodiscordant Couple. Dr Chow TS ID Clinic HPP

HIV: Pregnancy in Serodiscordant Couple. Dr Chow TS ID Clinic HPP HIV: Pregnancy in Serodiscordant Couple Dr Chow TS ID Clinic HPP Sexual Reproductive Health and Rights The recognition of the sexual and reproductive health and rights (SRHR) of all individuals and couples

More information

LTA Analysis of HapMap Genotype Data

LTA Analysis of HapMap Genotype Data LTA Analysis of HapMap Genotype Data Introduction. This supplement to Global variation in copy number in the human genome, by Redon et al., describes the details of the LTA analysis used to screen HapMap

More information

Sysmex Educational Enhancement and Development No

Sysmex Educational Enhancement and Development No SEED Haematology No 1 2015 Introduction to the basics of CD4 and HIV Viral Load Testing The purpose of this newsletter is to provide an introduction to the basics of the specific laboratory tests that

More information

Introduction to the Genetics of Complex Disease

Introduction to the Genetics of Complex Disease Introduction to the Genetics of Complex Disease Jeremiah M. Scharf, MD, PhD Departments of Neurology, Psychiatry and Center for Human Genetic Research Massachusetts General Hospital Breakthroughs in Genome

More information

Diagnostic Methods of HBV and HDV infections

Diagnostic Methods of HBV and HDV infections Diagnostic Methods of HBV and HDV infections Zohreh Sharifi,ph.D Blood Transfusion Research Center, High Institute for Research and Education in Transfusion Medicine Hepatitis B-laboratory diagnosis Detection

More information

Citation for published version (APA): Von Eije, K. J. (2009). RNAi based gene therapy for HIV-1, from bench to bedside

Citation for published version (APA): Von Eije, K. J. (2009). RNAi based gene therapy for HIV-1, from bench to bedside UvA-DARE (Digital Academic Repository) RNAi based gene therapy for HIV-1, from bench to bedside Von Eije, K.J. Link to publication Citation for published version (APA): Von Eije, K. J. (2009). RNAi based

More information

To provide you with the basic concepts of HIV prevention using HIV rapid tests combined with counselling.

To provide you with the basic concepts of HIV prevention using HIV rapid tests combined with counselling. Module 2 Integration of HIV Rapid Testing in HIV Prevention and Treatment Programs Purpose Pre-requisite Modules Learning Objectives To provide you with the basic concepts of HIV prevention using HIV rapid

More information

ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics

ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics ACE ImmunoID Biomarker Discovery Solutions ACE ImmunoID Platform for Tumor Immunogenomics Precision Genomics for Immuno-Oncology Personalis, Inc. ACE ImmunoID When one biomarker doesn t tell the whole

More information