HLA complex genes in type 1 diabetes and other autoimmune diseases. Which genes are involved?

Similar documents
Significance of the MHC

Significance of the MHC

Significance of the MHC

HLA and antigen presentation. Department of Immunology Charles University, 2nd Medical School University Hospital Motol

HLA and disease association

HLA and antigen presentation. Department of Immunology Charles University, 2nd Medical School University Hospital Motol

The Major Histocompatibility Complex (MHC)

Basic Immunology. Lecture 5 th and 6 th Recognition by MHC. Antigen presentation and MHC restriction

The major histocompatibility complex (MHC) is a group of genes that governs tumor and tissue transplantation between individuals of a species.

BDC Keystone Genetics Type 1 Diabetes. Immunology of diabetes book with Teaching Slides

Completing the CIBMTR Confirmation of HLA Typing Form (Form 2005)

Genetics and Genomics in Medicine Chapter 8 Questions

The Major Histocompatibility Complex

MHC class I MHC class II Structure of MHC antigens:

FULL PAPER Genotype effects and epistasis in type 1 diabetes and HLA-DQ trans dimer associations with disease

the HLA complex Hanna Mustaniemi,

AG MHC HLA APC Ii EPR TAP ABC CLIP TCR

Major Histocompatibility Complex (MHC) and T Cell Receptors

The MHC and Transplantation Brendan Clark. Transplant Immunology, St James s University Hospital, Leeds, UK

Dan Koller, Ph.D. Medical and Molecular Genetics

The Predisposition to Type 1 Diabetes Linked to the Human Leukocyte Antigen Complex Includes at Least One Non Class II Gene

ASSESSMENT OF THE RISK FOR TYPE 1 DIABETES MELLITUS CONFERRED BY HLA CLASS II GENES. Irina Durbală

Historical definition of Antigen. An antigen is a foreign substance that elicits the production of antibodies that specifically binds to the antigen.

10/18/2012. A primer in HLA: The who, what, how and why. What?

Principles of Adaptive Immunity

Antigen Presentation to T lymphocytes

IMMUNOLOGY. Elementary Knowledge of Major Histocompatibility Complex and HLA Typing

Two categories of immune response. immune response. infection. (adaptive) Later immune response. immune response

Alleles: the alternative forms of a gene found in different individuals. Allotypes or allomorphs: the different protein forms encoded by alleles

Profiling HLA motifs by large scale peptide sequencing Agilent Innovators Tour David K. Crockett ARUP Laboratories February 10, 2009

DEFINITIONS OF HISTOCOMPATIBILITY TYPING TERMS

Effects of Stratification in the Analysis of Affected-Sib-Pair Data: Benefits and Costs

Cover Page. The handle holds various files of this Leiden University dissertation.

Human Leukocyte Antigens and donor selection

General information. Cell mediated immunity. 455 LSA, Tuesday 11 to noon. Anytime after class.

The Human Major Histocompatibility Complex

IMMUNOGENETICS AND TRANSPLANTATION

HLA Disease Associations Methods Manual Version (July 25, 2011)

Nomenclature. HLA genetics in transplantation. HLA genetics in autoimmunity

[Some people are Rh positive and some are Rh negative whether they have the D antigen on the surface of their cells or not].

Histocompatibility antigens

COURSE: Medical Microbiology, MBIM 650/720 - Fall TOPIC: Antigen Processing, MHC Restriction, & Role of Thymus Lecture 12

Evidence of at least two type 1 diabetes susceptibility genes in the HLA complex distinct from HLA-DQB1, -DQA1 and DRB1

HLA Mismatches. Professor Steven GE Marsh. Anthony Nolan Research Institute EBMT Anthony Nolan Research Institute

Antigen presenting cells

Introduction to linkage and family based designs to study the genetic epidemiology of complex traits. Harold Snieder

T cell development October 28, Dan Stetson

Supplementary note: Comparison of deletion variants identified in this study and four earlier studies

IMMUNOGENETICS AND INTRODUCTION TO TRANSPLANTATION

Antigen processing and presentation. Monika Raulf

Key Concept B F. How do peptides get loaded onto the proper kind of MHC molecule?

B F. Location of MHC class I pockets termed B and F that bind P2 and P9 amino acid side chains of the peptide

A HLA-DRB supertype chart with potential overlapping peptide binding function

Autoimmune diseases. Autoimmune diseases. Autoantibodies. Autoimmune diseases relatively common

Introduction to the Genetics of Complex Disease

all of the above the ability to impart long term memory adaptive immunity all of the above bone marrow none of the above

HLA and more. Ilias I.N. Doxiadis. Geneva 03/04/2012.

Transplantation. Immunology Unit College of Medicine King Saud University

Indian Journal of Nephrology Indian J Nephrol 2001;11: 88-97

The Adaptive Immune Response. T-cells

Immunology - Lecture 2 Adaptive Immune System 1

Dr. Yi-chi M. Kong August 8, 2001 Benjamini. Ch. 19, Pgs Page 1 of 10 TRANSPLANTATION

Autoimmunity. Autoimmunity arises because of defects in central or peripheral tolerance of lymphocytes to selfantigens

How T cells recognize antigen. How T cells recognize antigen -concepts

Attribution: University of Michigan Medical School, Department of Microbiology and Immunology

Topic (Final-03): Immunologic Tolerance and Autoimmunity-Part II

T-cell receptor feature. Antibody/antigen interaction. Major Histocompatibility Complex. Antigen processing. Antigen presentation

FONS Nové sekvenační technologie vklinickédiagnostice?

25/10/2017. Clinical Relevance of the HLA System in Blood Transfusion. Outline of talk. Major Histocompatibility Complex

Insulin Resistance. Biol 405 Molecular Medicine

Part XI Type 1 Diabetes

What is Autoimmunity?

What is Autoimmunity?

BST227 Introduction to Statistical Genetics. Lecture 4: Introduction to linkage and association analysis

2000 Oxford University Press Human Molecular Genetics, 2000, Vol. 9, No

The Major Histocompatibility Complex of Genes

Basel - 6 September J.-M. Tiercy National Reference Laboratory for Histocompatibility (LNRH) University Hospital Geneva

THE ASSOCIATION OF SINGLE NUCLEOTIDE POLYMORPHISMS IN INTRONIC REGIONS OF ISLET CELL AUTOANTIGEN 1 AND TYPE 1 DIABETES MELLITUS.

Cellular Pathology of immunological disorders

CELL BIOLOGY - CLUTCH CH THE IMMUNE SYSTEM.

Chapter 22: The Lymphatic System and Immunity

The Adaptive Immune Response: T lymphocytes and Their Functional Types *

ORIGINAL ARTICLE Conserved extended haplotypes discriminate HLA-DR3-homozygous Basque patients with type 1 diabetes mellitus and celiac disease

LESSON 2: THE ADAPTIVE IMMUNITY

MULTIFACTORIAL DISEASES. MG L-10 July 7 th 2014

SEX. Genetic Variation: The genetic substrate for natural selection. Sex: Sources of Genotypic Variation. Genetic Variation

Class I Ag processing. TAP= transporters associated with antigen processing Transport peptides into ER

HLA Complex Genetics & Biology

Nature Genetics: doi: /ng Supplementary Figure 1

Documentation of Changes to EFI Standards: v 5.6.1

Robert B. Colvin, M.D. Department of Pathology Massachusetts General Hospital Harvard Medical School

RAISON D ETRE OF THE IMMUNE SYSTEM:

Alternative splicing: increasing diversity in the proteomic world

Role of NMDP Repository in the Evolution of HLA Matching and Typing for Unrelated Donor HCT

National Disease Research Interchange Annual Progress Report: 2010 Formula Grant


Research: Genetics HLA class II gene associations in African American Type 1 diabetes reveal a protective HLA-DRB1*03 haplotype

T Cell Development. Xuefang Cao, MD, PhD. November 3, 2015

T ype 1 diabetes is a multifactorial autoimmune

Whole-genome detection of disease-associated deletions or excess homozygosity in a case control study of rheumatoid arthritis

Transcription:

Review 93 through extensive genome replication accompanied by chromosome fusions and frequent rearrangements. Genetics 150, 1217 1228 6 Lagercrantz, U. and Lydiate, D. (1996) Comparative genome mapping in Brassica. Genetics 144, 1903 1910 7 Leitch, I.J. and Bennett, M.D. (1997) Polyploidy in angiosperms. Trends Plant Sci. 2, 470 476 8 Wendel, J.F. (2000) Genome evolution in polyploids. Plant Mol. Biol. 42, 225 249 9 Mayer, K. et al. (1999) Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402, 769 777 10 Lin, X. et al. (1999) Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402, 761 768 11 Theologis, A. et al. (20000 Sequence and analysis of chromosome 1 of Arabidopsis thaliana. Nature 408, 816 820 12 Salanoubat, M. et al. (2000) Sequence and analysis of chromosome 3 of Arabidopsis thaliana. Nature 408, 820 822 13 Tabata, S. et al. (2000) Sequence and analysis of chromosome 5 of Arabidopsis thaliana. Nature 408, 823 826 14 Bancroft, I. (2000) Insights into the structural and functional evolution of plant genomes afforded by the nucleotide sequences of chromosomes 2 and 4 of Arabidopsis thaliana. Yeast 17, 1 5 15 The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796 815 16 Bevan, M.W. et al. (1998) Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of Arabidopsis thaliana. Nature 391, 485 488 17 Blanc, G. et al. (2000) Extensive duplication and reshuffling in the Arabidopsis genome. Plant Cell 12, 1093 1101 18 van Dodeweerd, A-M. et al. (1999) Identification and analysis of homologous segments of the genomes of rice and Arabidopsis thaliana. Genome 42, 887 892 19 Ku, H-M. et al. (2000) Comparing sequenced segments of the tomato and Arabidopsis genomes: Large-scale duplication followed by selective gene loss creates a network of synteny. Proc. Natl. Acad. Sci. U. S. A. 97, 9121 9126 20 Acarkan, A. et al. (2000) Comparative genome analysis reveals extensive conservation of genome organization for Arabidopsis thaliana and Capsella rubella. Plant J. 23, 55 62 21 O Neill, C. and Bancroft, I. (2000) Comparative physical mapping of segments of the genome of Brassica oleracea var alboglabra that are homologous to sequenced regions of the chromosomes 4 and 5 of Arabidopsis thaliana. Plant J. 23, 233 243 22 Song, K. et al. (1995) Rapid genome change in synthetic polyploids of Brassica and its implications for polyploid evolution. Proc. Natl. Acad. Sci. U. S. A. 92, 7719 7723 23 Michelmore, R.W. and Meyers, B.C. (1998) Clusters of resistance genes in plants evolve by divergent selection and a birth-and-death process. Genome Res. 8, 1113 1130 24 Noel, L. et al. (1999) Pronounced intraspecific haplotype divergence at the RPP5 complex disease resistance locus of Arabidopsis. Plant Cell 11, 2099 2111 25 Tarchini, R. et al. (2000) The complete sequence of 340 kb of DNA around the rice Adh1 Adh2 region reveals interrupted colinearity with maize chromosome 4. Plant Cell 12, 391 391 26 Pichersky, E. (1990) Nomad DNA A model for movement and duplication of DNA sequences in plant genomes. Plant Mol. Biol. 15, 437 448 27 Bennetzen, J.L. (2000) Transposable element contributions to plant gene and genome evolution. Plant Mol. Biol. 42, 251 269 28 Moore, G. et al. (1995) Grasses, line up and form a circle. Curr. Biol. 5, 737 739 29 Chen, M. et al. (1998) Sequence organization and conservation in sh2/a1-homologous regions of sorghum and rice. Genetics 148, 435 443 30 Tikhonov, A.P. et al. (1999) Colinearity and its exceptions in orthologous adh regions of maize and sorghum. Proc. Natl. Acad. Sci. U. S. A. 96, 7409 7414 31 Gaut, B.S. and Doebley, J.F. (1997) DNA sequence evidence for the segmental allotetraploid origin of maize. Proc. Natl. Acad. Sci. U. S. A. 94, 6809 6814 32 Vision, J. et al. (2000) The origins of genomic duplications in Arabidopsis. Science 290, 2114 2117 HLA complex genes in type 1 diabetes and other autoimmune diseases. Which genes are involved? Dag E. Undlien, Benedicte A. Lie and Erik Thorsby The predisposition to develop a majority of autoimmune diseases is associated with specific genes within the human leukocyte antigen (HLA) complex. However, it is frequently difficult to determine which of the many genes of the HLA complex are directly involved in the disease process. The main reasons for these difficulties are the complexity of associations where several HLA complex genes might be involved, and the strong linkage disequilibrium that exists between the genes in this complex. The latter phenomenon leads to secondary disease associations, or what has been called hitchhiking polymorphisms. Here, we give an overview of the complexity of HLA associations in autoimmune disease, focusing on type 1 diabetes and trying to answer the question: how many and which HLA genes are directly involved? D.E. Undlien* B.A. Lie E. Thorsby Institute of Immunology, The National Hospital and University of Oslo, N-0027 Oslo, Norway. *e-mail: d.e.undlien@rh.uio.no Type 1 diabetes, rheumatoid arthritis, multiple sclerosis and Graves disease are all autoimmune diseases, to name but a few, probably resulting from an inappropriate immune response to self-proteins. Apart from being characterized by loss of immunological tolerance to self, these diseases share other important characteristics. First, they are all complex diseases where multiple genes and environmental factors act in concert in their aetiology. Second, at least one of the genetic factors involved in the development of these diseases is located in the HLA complex on chromosome 6(Ref. 1). The genes within this complex that are involved in disease are frequently difficult to identify. In this article, we focus on progress made in establishing which HLA genes are directly involved and discuss methodological aspects that are important to consider when studying the effects of this gene complex. We focus directly on type 1 diabetes because studies of this disease have implications for the understanding of other HLA associated diseases. Type 1 diabetes Diabetes mellitus is a group of heterogeneous disorders characterized by hyperglycaemia (high blood sugar). The incidence of these disorders is increasing worldwide. 0168-9525/01/$ see front matter 2001 Elsevier Science Ltd. All rights reserved. PII: S0168-9525(00)02180-6

94 Review Fig. 1. The HLA complex and some of its genes. Only a small number of genes in this gene complex are shown. There are a total of 128 genes in the HLA complex (not including the extended HLA class I complex) of which 40% are assumed to have an immune function. The genes encoding class I and II molecules are depicted in red, other genes in yellow. C2 and C4 are complement genes, TNF is the gene encoding tumor necrosis factor. The provisional borders of the extended HLA complex are the HSET gene and the hemochromatosis gene HFE. Two microsatellites discussed in this article are also depicted. HSET Extended class II DPB1 DPA1 } DQB1 DQA1 } Class II DRB1 DRA } DP DQ DR D6S273 C2 TNF Class III Type 1 diabetes (previously called insulin-dependent diabetes mellitus; IDDM) can affect people in all age groups. It is a multifactorial disease involving multiple genes, with unidentified environmental factors contributing to the pathogenesis 2. Through what is believed to be an autoimmune process, the disease results in the destruction of insulin-producing bcells in the pancreas, with concomitant reduction in endogenous insulin production. Epidemiological evidence for a genetic component to the disease comes from, among others, studies of twins that show a concordance rate in monozygotic twins of 30 70% (i.e. if one twin is affected, the second twin has a 30 70% chance of developing the disease), which is higher than in dizygotic twins (approximately 10%) and in siblings (6%) 3,4. The incidence of the disease has large geographical variations, which might be caused by differences in frequencies of HLA encoded genetic susceptibility factors between different ethnic groups 5. It is clear that non-hla genes are also involved. Many linkage studies are, however, in disagreement and molecular mechanisms of non-hla associations are mostly unknown 6,7. The emerging picture is that there is one gene complex harbouring the major predisposing genes, the HLA complex, plus several minor predisposing genes spread throughout the genome. The HLA complex can explain 35 50% of the familial clustering of the disease 8. However, the HLA complex contains at least 128 genes 9, and it is difficult to determine which gene or genes are responsible. The HLA complex The HLA complex is the major histocompatibility complex (MHC) in human and its complete sequence was presented in 1999(Ref. 9). This, a milestone in our understanding of this complex, should facilitate further studies aimed at dissecting the complex nature of HLA associations with several complex diseases including type 1 diabetes. The HLA complex C4 B Chromosome 6 C E Extended HLA complex A Class I G F D6S223 HFE Extended class I 0.3 Mb 0.9 Mb 0.7 Mb 1.8 Mb 4 Mb is a densely packed gene cluster containing at least 128 genes (not including the extended class I region where the number of genes is not known in detail) located on the short arm of chromosome 6 (Fig. 1). Approximately 40% of the genes within this cluster have an immune function. Originally, the HLA complex was considered to extend from HLA-DP to HLA-E. However, there are related genes on both the telomeric and centromeric side of these borders, leading to the concept of the extended HLA complex. This includes the extended class I and II regions where the centromeric border is the HSET gene and the provisional telomeric border is the haemochromatosis gene HFE. Within the HLA complex there are the genes encoding the classical HLA class I (HLA-A, -B, -C) and class II (DR, DQ and DP) molecules, which are peptide-presenting molecules for T lymphocytes (Box 1). These genes are the most polymorphic genes known in humans (HLA- B has close to 400 characterized alleles). In addition to their extreme polymorphism, strong linkage disequilibrium (LD) between neighbouring genes is a hallmark. LD refers to the non-random assortment of alleles at neighbouring loci (Box 2) 10,11. The fact that genes tend to be inherited as conserved blocks of linked alleles called haplotypes, has important implications for mapping of disease-associated genes. For example, there are several diseases positively associated with the HLA-DQ2-DR3-B8 haplotype e.g. type 1 diabetes, celiac disease and Graves disease (see Box 3 for nomenclature). In such diseases both the frequency of the DQA1*0501 and DQB1*0201 alleles (encoding the DQ2 molecule), the DRB1*03 allele (encoding DR3) and the B*08 allele (encoding B8) will tend to be increased among patients compared with controls. It is then very difficult to determine which of these alleles are directly involved in the predisposition and which are only secondarily associated because of LD with the directly associated allele.

Review 95 Box 1. Function of the classical HLA class I and II molecules HLA molecules are located in the cell membrane and are very similar in their three-dimensional structure. The HLA class I molecules (HLA-A, -B and C) These consist of an a-chain encoded by the respective polymorphic class I genes and a b-chain (b 2 -microglobulin) encoded by a non-polymorphic gene on chromosome 15. Class I molecules are expressed on almost all nucleated cells. Their most important function is to present peptide fragments predominantly derived from endogenous, cytosolic proteins to CD8 + cytotoxic T lymphocytes. HLA class I molecules are synthesized and transported to the endoplasmic reticulum where they encounter peptides resulting from degraded cytosolic proteins (including viral proteins if the cell is infected). Different HLA molecules bind to different sets of peptides. Peptides that fit into the peptide-binding groove of the particular HLA class I molecule can bind non-covalently, and the complex of HLA class I molecule and peptide is then transported to the cell membrane where the HLA class I peptide complex can encounter CD8 + cytotoxic T cells. This can result in cytotoxic lysis of the virally infected cell by the T lymphocyte. The HLA class II molecules (HLA-DR, -DQ and DP) These are made up of an a- and a b-chain encoded by two separate polymorphic genes in the HLA complex; for example, DQA1 encodes the a-chain and DQB1 encodes the b-chain of the DQ molecule (DRA1 is an exception as this gene is virtually non-polymorphic). Under normal circumstances class II molecules are expressed only on specialized antigen-presenting cells such as dendritic cells, B lymphocytes, macrophages and thymic epithelial cells. Their main function is to present peptides to CD4 + helper T lymphocytes. Extracellular proteins are endocytosed or phagocytosed and degraded in endosomes/lysosomes. The HLA class II molecules are routed to the endosomal compartment where they can bind to peptide fragments if the peptides fit in the peptidebinding groove of the particular HLA class II molecule. CD4 + T lymphocyte α β T-cell receptor (TCR) CD4 Peptide HLA class II (DR, DQ, DP) Antigenpresenting cell The problem of LD will probably not be unique to HLA. Clusters of genes of similar function are found at several places in the genome. It is probable that in several complex diseases there might be loci linked with each other and involved in disease pathogenesis. α β HLA-DRB1, -DQA1 and -DQB1 genes are directly involved in the predisposition to type 1 diabetes Type 1 diabetes was initially associated with some HLA class I alleles, namely the alleles encoding the B8 and B15 molecules 12,13. Later, a stronger association with the HLA class II alleles encoding DR3 and DR4 was found, and it was concluded that the class I associations observed previously were secondary to LD with these high-risk DR alleles 14,15. Similarly, B7 was found to be decreased among type 1 diabetes patients, and later this negative association was found to be even stronger for DR2 which is in LD with B7 (Ref. 16). Finally, in 1983, Owerbach et al. 17 observed that the distribution of a restriction-fragment length polymorphism (RFLP) in an HLA class II subregion, later identified as the DQ subregion, was significantly different in type 1 diabetes patients compared with controls matched for alleles at the DR locus. This implied that the observed associations between particular DR alleles and type 1 diabetes could be secondary to LD with high-risk DQ alleles. This was further supported by the finding that type 1 diabetes strongly correlates with HLA alleles not encoding aspartic acid at position 57 of the DQbchain 18. Aspartic acid in this position was later shown to be important for the shape of the peptide-binding groove of HLA-DQ molecules, and hence, have implications for which antigenic peptides can be bound by these molecules. It has become generally accepted that the genes encoding the peptide-presenting DQ molecules (i.e. the DQA1 and DQB1 genes) are directly involved in the development of type 1 diabetes. First, the HLA associations with diabetes are consistently strongest for the DQ subregion (or, in rarer instances, the DR subregion). Second, transracial studies show that similar alleles of DQA1 and DQB1 are associated with the disease in all populations. Thus, despite the fact that the composition of the haplotypes can vary considerably between various ethnic groups, particular combinations of alleles of the DQ genes are the common denominators of susceptibility to type 1 diabetes 19. The DQ involvement in type 1 diabetes is complex, as some DQ haplotypes confer susceptibility, others protection and still others are neutral (Table 1). Third, there is a clear correlation between particular DQ (a1,b1) heterodimers and type 1 diabetes, irrespective of whether the a- and b-chains are encoded by genes on the same chromosome or haplotype (cis) or on different chromosomes (trans) 20. Even though there is general agreement that DQ genes are directly involved in type 1 diabetes, there is substantial evidence that the DQ genes cannot explain all the genetic susceptibility encoded within the HLA complex. This has been made particularly clear by studies of the contribution of DR4 subtypes to the development of the disease 21 23 where different DR4 subtypes were found to be distributed differently on DQ8 (DQA1*03-DQB1*0302) haplotypes between patients and controls (Fig. 2). Studies of other haplotypes also indicated that DR

96 Review Box 2. Linkage disequilibrium Linkage disequilibrium (LD) refers to the nonrandom assortment of alleles at neighbouring loci at the population level. Thus, when the occurrence of pairs of specific alleles at different loci on the same haplotype is not independent, the deviation from independence is termed linkage disequilibrium (e.g. the alleles encoding DQ2 occur together with the alleles encoding DR3 more frequently than expected by chance). Several statistical measures of LD have been developed; one of the simplest being: D = frequency of AB -frequency of A frequency of B, where A is a particular allele at locus 1 and B is a particular allele at locus 2. Today, LD is generally taken to mean allelic association caused by tight linkage. In more general terms, LD can refer to any form of allelic association, including associations caused by admixture. If, for example, two genetically different populations are mixed and these two populations have different allele frequencies at a given locus and also have different frequencies of a particular disease, the gene in question will tend to be associated, but neither linked to nor involved in the pathogenesis of the disease. This phenomenon of spurious associations caused by admixture is also referred to as population stratification. Methods to look for new susceptibility genes eliminating secondary associations caused by LD with linked directly involved genes Stratified case-control analysis Comparing cases and controls that are matched for known disease susceptibility genes in the case of type 1 diabetes, DQ (DQA1, DQB1) and DR (DRB1) genes. Homozygous parent test Linkage test using affected sib pair (ASP) analysis only including parents being homozygous for DR and DQ genes in the analysis. Homozygous parent transmission disequilibrium test (TDT) TDT only includes transmissions from DR-DQ homozygous parents in the analysis (Fig. 3). Conditional extended transmission disequilibrium test (CETDT) CETDT is an extended TDT analysis conditioned (stratified) on parental DR-DQ genotypes. Box 3. HLA nomenclature The history of the discovery of the HLA molecules and the HLA complex genes are reflected in the nomenclature for factors in the HLA system. Originally, HLA specificities were defined serologically by antibodies or by T-cell specificities and only later were they genomically defined. The table gives some examples. The serological specificity DR4 can be encoded by several different alleles, reflecting the much higher resolution of DNA-based HLA typing. HLA locus Serological specificity Allele DRB1 DR4 DRB1*0401 DR4 DRB1*0402 DR4 DRB1*0403 DR4 DRB1*0404 DR4 DRB1*0408 DR3 DRB1*0301 DQA1 DQA1*0501 DQA1*0401 DQB1 DQ2 DQB1*0201 DQ8 DQB1*0302 B B8 B*0801 HLA genes The names of genomically defined alleles consist in general of the locus name followed by an asterisk and 4 5 digits. The first two digits frequently reflect the previous serological nomenclature. HLA molecules HLA molecules can be named after their a- and b- chains or as an abbreviated form by the serological specificity (epitope) which is imparted by the molecule. For example, the DQ molecule encoded by the DQA1 allele DQA1*0501 (encoding the DQa chain DQa*0501) and the DQB1 allele DQB1*0201 (encoding the DQb chain DQb1*0201) is designated DQ(a1*0501,b1*0201) or, as a short form, DQ2. HLA haplotypes Particular alleles of the different HLA genes are frequently inherited as conserved blocks called haplotypes. Such haplotypes are listed using the individual alleles of the various genes separated by hyphens. To avoid lengthy designations serological specificities are used in a much of the literature instead of the full genomic allele designations even if the alleles are determined genomically. For example, the HLA-DQB1*0201-DQA1*0501- DRB1*03-B*0801 haplotype is often written as the HLA-DQ2-DR3-B8 haplotype. In this article, the latter designation is used.

Review 97 Table 1. Some HLA-DRB1-DQA1-DQB1 haplotypes and their impact on type 1 diabetes susceptibility DRB1-DQA1-DQB1 haplotype 04-03-0302 Susceptible 03-0501-0201 Susceptible 01-0101-0501 Neutral a 0102-0602 Protective a This haplotype might be weakly susceptible in some populations, and in others it is weakly protective relative to other HLA haplotypes depending on haplotype frequencies in the background populations. genes are probably directly involved in predisposition to the disease 24. The evidence for a primary role of some DRB1 alleles is, however, not as conclusive as for DQA1 and DQB1 because the possibility exists that DRB1 genes might only be a marker for other HLA complex gene(s) directly involved in the predisposition. However, the genes surrounding DRB1 do, in general, display weaker associations with type 1 diabetes than DRB1 when comparing DQ-matched cases and controls. Also, the DRB1 associations with type 1 diabetes seem to be constant through different ethnic groups. In addition, the DRB1 and DQB1 genes are virtually identical, and the three-dimensional structure of DR and DQ molecules is thought to be very similar. Further evidence for a direct involvement of MHC class II genes encoding peptide-presenting molecules comes from animal models. In the non-obese diabetic (NOD) mouse, an animal model for type 1 diabetes, a role for the mouse homologues of the DQ molecules (I-A) and the DR molecules (I-E) has been established DQB1 DQA1 DRB1 0302 0302 0302 0302 DQ8 03 0401 03 0405 03 0403 03 0406 12 kb 68 kb Susceptible Susceptible Protective Protective Fig. 2. DR4 subtypes in type 1 diabetes. The risk encoded by DQ8 haplotypes (DQA1*03 DQB1*0302) is influenced by which DRB1*04 subtype is carried. For example, DQ8 haplotypes carrying the DRB1*0401 or 0405 alleles are susceptible, whereas those carrying DRB1*0403 or 0406 are protective. by congenic mapping and transgenic models, providing functional evidence that the DRB1, DQA1 and DQB1 genes are directly involved in type 1 diabetes. Mice transgenic for the susceptible DR3- and DQ8-encoding genes develop diabetes, whereas identical mice made transgenic for the genes encoding the protective DQ6 molecule do not 25. Analytical methods to look for additional disease susceptibility genes in the HLA complex Several lessons should be learnt from the history of HLA associations in type 1 diabetes. First, one must expect to find associations with virtually all genes in the HLA complex because of LD with high-risk DR and DQ alleles. The important question is: are there other HLA complex genes that have effects independent or additional to the involved DR and DQ genes? To address this question, LD must be taken into account (Box 2). One has to make sure that the potential associations observed are not secondary to LD with alleles at the primary susceptibility loci (DRB1, DQA1, DQB1) before one can conclude that such disease associations are caused by independent additional genetic risk factors in the HLA complex. A method frequently used to try to eliminate secondary associations caused by LD with DQ or DR alleles is to calculate some measurement of LD, of which there exist several methods 26, between alleles at the established disease loci (i.e. DR and DQ) and the locus being tested for disease association. Absence of statistically significant LD between DR and/or DQ genes and the associated gene being studied has frequently been taken as evidence for an association not being caused by LD with alleles at the established disease-associated loci (DR and DQ). Our belief is that this is not sufficient to eliminate subtle effects of LD. Several weak and statistically insignificant LDs might still confound the data significantly 27. Application of this procedure probably underlies several unreproduced reports of independent disease associations with additional genes in the HLA complex. Methods to eliminate the effects of LD There are methods that efficiently eliminate LD with the disease involved class II genes. The most widely used is a stratified case-control analysis where patients and controls are matched for the genes involved in disease; that is, DR and DQ. A problem has been that it requires a very large number of random controls to get a sufficient number of controls carrying the same HLA class II alleles as the patients. In addition, case-control analysis is received with a great deal of scepticism, mainly because associations might be caused by population stratification. However, case-control analysis is statistically powerful and avoids the need to collect families. In addition, there are an increasing number of controls available, and methods aimed at detecting population stratification are being developed 28. Hence, provided

98 Review Father DR1/4 (excluded from analysis) Fig. 3. The homozygous parent transmission disequilibrium test (TDT). The TDT tests transmission of marker alleles (in this case allele 3 of marker D6S2223) to probands from parents being heterozygous for the marker D6S2223. In the homozygous parent TDT, only transmissions from parents homozygous at a locus known to be associated with disease (in this case DR-DQ) are included. The father in this case is not included in the analysis because he is not homozygous for DR-DQ. The transmission of D6S2223*3 to probands from DR3-DQ2 homozygous parents is significantly different from the expected 50%. This implies that D6S2223 is in linkage disequilibrium (LD) with a disease susceptibility gene distinct from DR-DQ.? P 3 2 A Mother DR3 DQ2 homozygous 3 2 A Haplotype of affected individual 3 2 B B not transmitted DR DQ Microsatellite the patients and controls are fairly homogenous within ethnic groups, HLA class II-matched casecontrol analysis still seems a good choice. Alternatives to case-control analysis are several family-based tests that efficiently eliminate the effect of LD. In most cases, these methods are also insensitive to population stratification. The homozygous parent test as originally described 29 is an affected sibpair (ASP)- based linkage analysis that only includes parents homozygous for the HLA genes involved in disease. An extension to this test that also allows testing for association is the homozygous parent transmission disequilibrium test (homozygous parent TDT; Fig. 3) 30. This analysis is also limited to transmissions from homozygous parents. However, rather than an ASP analysis, it employs a TDT analysis 31. Both the homozygous parent test and the homozygous parent TDT are limited by the fact that only some of the family data can be used (it is limited to DR-DQ homozygous parents). However, a strength is that it easily accommodates haplotype specific interactions because it analyses haplotypes separately. Another recently developed method is the conditional extended TDT (CETDT) described by Koeleman et al. 32. This method controls for LD by stratifying the TDT analysis on the parental DR-DQ genotype. Compared with the homozygous parent TDT, it has the advantage that it includes information from both parents. A disadvantage is that it does not detect haplotype specific effects or allelic heterogeneity so easily. Are HLA complex genes other than DR and DQ involved in the predisposition to type 1 diabetes? In the past few years, there have been several studies applying methods able to eliminate the effects of LD to DR and DQ genes. These studies have demonstrated that there are indeed additional genes in the HLA complex involved. Among the first to show this were Robinson et al. 29 who applied the homozygous parent test to families with type 1 diabetes. They found evidence for heterogeneity in terms of genetic risk on different DR3 and DR4 haplotypes implying the presence of genetic susceptibility factors in addition to DR and DQ on these haplotypes. The results observed for DR4 haplotypes can probably be explained by the wellestablished effect of different DR4 subtypes on the risk of type 1 diabetes. The results for DR3 haplotypes, however, suggest the presence of additional HLA linked type 1 diabetes susceptibility genes besides the DR and DQ genes. Because the method used was based upon linkage analysis (ASP), the region where such a novel susceptibility gene could be located is large. Susceptibility genes in the HLA class I extended class I region We have systematically assessed microsatellite markers spanning the extended HLA complex 9 using both a large number of HLA class II matched cases and controls as well as the homozygous parent TDT in large family datasets from Norway, Denmark and the UK 30. We found clear evidence that a gene in LD with a particular allele (3) at microsatellite D6S2223 modifies the risk encoded by the DR3-DQ2 haplotype both in case-control analysis and in TDT analysis (Fig. 3). The transmission of D6S2223*3 from DR3 homozygous parents to their diabetic child was significantly less than the expected 50%. This strongly implies that there is a gene in LD with marker D6S2223 that influences the pathogenesis of type 1 diabetes. A study by Herr et al. 33, using data partly overlapping the UK data that we used, applied a CETDT analysis of markers spanning the HLA complex and found only one marker outside the class II region that showed significant evidence for independent association to type 1 diabetes, namely D6S2223. In this study, they also showed using microsatellite markers that the region of maximum association, identified the region harbouring the major type 1 diabetes susceptibility genes DR and DQ with great precision. On the basis of this study and analysis of multilocus haplotypes (B.A. Lie et al., unpublished), we believe that the putative susceptibility gene is located in the extended class I region in the vicinity of D6S2223. Some reports have also suggested that HLA class I genes could be associated with type 1 diabetes independently of DR and DQ genes 34. If these associations are confirmed, they will need to be correlated to the associations at D6S2223 to see whether these associations are independent of LD between D6S2223 and class I genes.

Review 99 Class II Extended class II Class III Class I Extended class I DP? DR DQ D6S273? B? A? D6S2223 Type 1 diabetes associations Mouse MHC (chr. 17) I-E I-A HLA complex (chr. 6) Fig. 4. Major histocompatibility complex (MHC) genes involved in type 1 diabetes. Comparative maps of the MHC in human and mouse. In both man and mouse, the homologous DR/I-E and DQ/I-A class II genes are known to be involved (indicated in red). In mice, it has also been demonstrated that, both centromeric 39 and telomeric 40 of these genes, there are susceptibilty genes for type 1 diabetes (indicated by broad orange arrows). In humans, reproducible evidence for associations independent of DR-DQ has been found for marker D6S2223. In addition, numerous other genes and markers such as HLA-DP, -A, -B and D6S273 have been found in some studies to be associated with disease; however, these associations need to be confirmed. Susceptibility genes in the class III region There have been two studies suggesting that the class III marker D6S273 is associated with development of type 1 diabetes independently of the DR and DQ genes 30,35. However, different alleles were found associated in the different populations and it has not been established clearly whether this marker is associated independently of the association observed for D6S2223 (Ref. 30). Further studies are needed to clarify this. Additional susceptibility genes in the class II region There have been contradictory reports on a potential role of the genes encoding the DP molecules (DPA1, DPB1) 36,37, and this issue is not yet settled. The peptide transporters TAP1 and TAP2 have been analysed in several studies, and, although some initial reports suggested an independent role of the TAP2 gene, later studies suggested that this was a false positive result caused by statistically insignificant LDs confounding the data. Hence, current evidence suggests that the antigenprocessing genes (TAP1, TAP2, LMP2, LMP7) in the class II region are not directly involved in type 1 diabetes 27,38. Animal models It has been clearly established in the NOD mouse that there are other genes linked to the MHC (in addition to the I-A and I-E genes) that affect disease susceptibility 39,40 (Fig. 4). Conclusion The HLA complex is characterized by a high gene density, genetic complexity and strong LD. Several common complex genetic diseases, in particular most autoimmune diseases, have a major genetic component encoded within this complex. Current evidence suggests that, for type 1 diabetes, there are at least four genes involved. The class II genes DQA1, DQB1, DRB1 are the most important ones, but in addition there is at least one unidentified gene most probably located telomeric to the class I/extended class I region. So far, LD has hampered the detection of all genetic susceptibility factors within the HLA complex. However, the availability of increasing numbers of families, cases and controls, as well as methodological improvements to eliminate the unwanted secondary effects of LD give reason to believe that we will soon be able to identify all the HLA complex genes directly involved in the predisposition to develop type 1 diabetes and other autoimmune diseases. The currently running disease component of the 13th International Histocompatibility Workshop has this as one of its main aims (http://www.ihwg.org). Such studies will also be beneficial for the analysis of other clusters of disease-associated genes in genetically complex diseases. References 1 Thorsby, E. (1997) HLA associated diseases. Anniversary review article. Hum. Immunol. 53, 1 11 2 Tisch, R. and McDevitt, H. (1996) Insulindependent diabetes mellitus. Cell 85, 291 297 3 Kyvik, K.O. et al. (1995) Concordance rates of insulin dependent diabetes mellitus: a population based study of young Danish twins. Br. Med. J. 311, 913 917 4 Thomson, G. et al (1988) Genetic heterogeneity, modes of inheritance, and risk estimates for a joint study of Caucasians with insulin-dependent diabetes mellitus. Am. J. Hum. Genet. 43, 799 816 5 Dorman, J.S. et al. (1990) Worldwide differences in the incidence of type I diabetes are associated with amino acid variation at position 57 of the HLA-DQ beta chain. Proc. Natl. Acad. Sci. U. S. A. 87, 7370 7374 6 Mein, C.A. et al.(1998) A search for type 1 diabetes susceptibility genes in families from the United Kingdom. Nat.Genet. 19, 297 300 7 Concannon, P. et al. (1998) A second-generation screen of the human genome for susceptibility to insulin-dependent diabetes mellitus. Nat. Genet. 19, 292 296 8 Risch, N. (1987) Assessing the role of HLA-linked and unlinked determinants of disease. Am. J. Hum. Genet. 40, 1 14 9 The MHC Sequencing Consortium (1999) Complete sequence and gene map of a human major histocompatibility complex. Nature 401, 921 923 10 Tomlinson, I.P. and Bodmer, W.F. (1995) The HLA system and the analysis of multifactorial genetic disease. Trends Genet. 11, 493 498 11 Jorde, L.B. (1995) Linkage disequilibrium as a gene-mapping tool. Am. J. Hum. Genet. 56, 11 14 12 Singal, D.P. and Blajchman, M.A. (1973) Histocompatibility (HL-A) antigens, lymphocytotoxic antibodies and tissue antibodies in patients with diabetes mellitus. Diabetes 22, 429 432 13 Nerup, J. et al. (1974) HL-A antigens and diabetes mellitus. Lancet 2, 864 866 14 Thomsen, M. et al. (1975) MLC typing in juvenile diabetes mellitus and idiopathic Addison s disease. Transplant. Rev. 22, 125 147 15 Nerup, J. et al. (1977) HLA and endocrine diseases. In HLA and disease (Dausset, J. and Svejgaard, A., eds), pp. 149 167, Munksgaard 16 Farid, N.R. et al. (1979) HLA-D-related (DRw) antigens in juvenile diabetes mellitus. Diabetes 28, 552 557 17 Owerbach, D. et al. (1983) HLA-D region betachain DNA endonuclease fragments differ between HLA-DR identical healthy and insulin-dependent diabetic individuals. Nature 303, 815 817 18 Todd, J.A. et al. (1987) HLA-DQ beta gene contributes to susceptibility and resistance to insulin-dependent diabetes mellitus. Nature 329, 599 604 19 She, J.X. (1996) Susceptibility to type 1 diabetes: HLA-DQ and DR revisited. Immunol. Today 17, 323 329 20 Thorsby, E. and Ronningen, K.S. (1993) Particular HLA-DQ molecules play a dominant role in determining susceptibility or resistance to type 1 (insulin-dependent) diabetes mellitus. Diabetologia 36, 371 377

100 Review 21 Caillat-Zucman, S. et al. (1992) Age-dependent HLA genetic heterogeneity of type 1 insulindependent diabetes mellitus. J. Clin. Invest. 90, 2242 2250 22 Cucca, F. et al. (1995) The distribution of DR4 haplotypes in Sardinia suggests a primary association of type 1 diabetes with DRB1 and DQB1 loci. Hum. Immunol. 43, 301 308 23 Undlien, D.E. et al. (1997) HLA encoded genetic predisposition in insulin-dependent diabetes mellitus (IDDM). DR4 subtypes may be associated with different degrees of protection. Diabetes 46, 143 149 24 Undlien, D.E. et al. (1999) HLA associations in type 1 diabetes among patients not carrying highrisk DR3-DQ2 or DR4-DQ8 haplotypes. Tissue Antigens 54, 543 551 25 Abraham, R.S. et al. (2000) Co-expression of HLA DR3 and DQ8 results in the development of spontaneous insulitis and loss of tolerance to GAD65 in transgenic mice. Diabetes 49, 548 554 26 Devlin, B. and Risch, N. (1995) A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics 29, 311 322 27 Caillat-Zucman, S. et al. (1995) Family study of linkage disequilibrium between TAP2 transporter and HLA class II genes. Absence of TAP2 contribution to association with insulin-dependent diabetes mellitus. Hum. Immunol. 44, 80 87 28 Pritchard, J.K. et al. (2000) Association mapping in structured populations. Am. J. Hum. Genet. 67, 170 181 29 Robinson, W.P. et al. (1993) Homozygous parent affected sib pair method for detecting disease predisposing variants: application to insulin dependent diabetes mellitus. Genet. Epidemiol. 10, 273 288 30 Lie, B.A. et al. (1999) The predisposition to type 1 diabetes linked to the HLA complex (IDDM1) includes at least one non-class II gene. Am. J. Hum. Genet. 64, 793 800 31 Spielman, R.S. et al. (1993) Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am. J. Hum. Genet. 52, 506 516 32 Koeleman, B.P. et al. (2000) Adaptation of the extended transmission/disequilibrium test to distinguish disease associations of multiple loci: the conditional extended transmission/ disequilibrium test. Ann. Hum. Genet. 64, 207 213 33 Herr, M. et al. (2000) Evaluation of fine mapping strategies for a multifactorial disease locus: systematic linkage and association analysis of IDDM1 in the HLA region on chromosome 6p21. Hum. Mol. Genet. 9, 1291 1301 34 Nejentsev, S. et al. (2000) The effect of HLA-B allele on the IDDM risk defined by DRB1*04 subtypes and DQB1*0302. Diabetes 46, 1888 1892 35 Moghaddam, P.H. et al. (1997) TNFa microsatellite polymorphism modulates the risk of IDDM in Caucasians with the high-risk genotype HLA DQA1*0501- DQB1*0201/DQA1*0301-DQB1*0302. Belgian Diabetes Registry. Diabetes 46, 1514 1515 36 Lie, B.A. et al. (1997) HLA associations in insulindependent diabetes mellitus: No independent association to DP genes. Hum. Immunol. 55, 170 175 37 Noble, J.A. et al. (2000) The HLA class II locus DPB1 can influence susceptibility to type 1 diabetes. Diabetes 49, 121 125 38 Undlien, D.E. et al. (1997) No independent association of LMP2 and LMP7 polymorphisms with susceptibility to IDDM. Diabetes 46, 307 312 39 Hattori, M. et al. (1999) Cutting edge: homologous recombination of the MHC class I K region defines new MHC-linked diabetogenic susceptibility gene(s) in nonobese diabetic mice. J. Immunol. 163, 1721 1724 40 Mathews, C.E. et al. (2000) Reevaluation of the major histocompatibility complex genes of the NOD-progenitor CTS/Shi strain. Diabetes 49, 131 134 Alternative splicing: increasing diversity in the proteomic world Brenton R. Graveley How can the genome of Drosophila melanogaster contain fewer genes than the undoubtedly simpler organism Caenorhabditis elegans? The answer must lie within their proteomes. It is becoming clear that alternative splicing has an extremely important role in expanding protein diversity and might therefore partially underlie the apparent discrepancy between gene number and organismal complexity. Alternative splicing can generate more transcripts from a single gene than the number of genes in an entire genome. However, for the vast majority of alternative splicing events, the functional significance is unknown. Developing a full catalog of alternatively spliced transcripts and determining each of their functions will be a major challenge of the upcoming proteomic era. B.R. Graveley Dept of Genetics and Developmental Biology, University of Connecticut Health Center, Farmington, CT 06030, USA. e-mail: graveley@ neuron.uchc.edu A major undertaking of the post-genomic era will be the description and functional characterization of the full complement of proteins (i.e. the proteome) expressed by an organism. DNA recombination, RNA editing and alternative splicing make this task more difficult than it first appears as these processes increase the number of proteins that can be synthesized from each gene. As a result, the number of proteins in the proteome is by no means equivalent to the number of genes, but can exceed it by literally orders of magnitude. The mechanism most widely used to enhance protein diversity, with regard to the number of genes affected and the breadth of organisms it occurs in, is alternative splicing (Box 1). Alternative splicing can generate multiple transcripts encoding proteins with subtle or opposing functional differences that can have profound biological consequences. In this article, I will discuss the prevalence of alternative splicing and the amount of diversity it can create, provide examples of several complex alternative splicing events and, finally, discuss the issue of how much of the observed alternative splicing actually represents errors or noise in the system. Readers should refer to an excellent recent review by Smith and Valcárcel 1 for more information about the biochemical mechanisms of alternative splicing. It s all around us All eukaryotes contain INTRONS (see Glossary) in at least some of their genes although the number can vary considerably from organism to organism. For example, only 250 out of the 6000 genes in Saccharomyces cerevisiae contain introns 2, whereas most of the estimated 35 000 human genes 3,4 are thought to contain introns. But how many of these genes encode transcripts that are alternatively 0168-9525/01/$ see front matter 2001 Elsevier Science Ltd. All rights reserved. PII: S0168-9525(00)02176-4