Genetic aspects of Multiple Sclerosis


Design and layout: A twee A, Foelke Vos
Cover illustration: Sytse van der Zee
Printing: Stichting Drukkerij C. Regenboog, Groningen
ISBN 90-9020291-9
© 2005 M. Boon
The studies in this thesis were financially supported by a grant from Stichting MS Research.

Rijksuniversiteit Groningen

Genetic aspects of Multiple Sclerosis

Thesis submitted to obtain the doctorate in Medical Sciences at the Rijksuniversiteit Groningen, on the authority of the Rector Magnificus, dr. F. Zwarts, to be defended in public on Wednesday 4 January 2006 at 16:15 by Maartje Boon, born on 28 January 1966 in Nijmegen

Supervisors: Prof. dr. J.H.A. De Keyser, Prof. dr. C.H.C.M. Buys
Co-supervisor: Dr. ir. G.J. te Meerman
Assessment committee: Prof. dr. J.J.A. Mooij, Prof. dr. R.C. Jansen, Prof. dr. J.B.M. Kuks

For my father and mother
For Dick and David

Paranymphs: R.M.A. Boon, A.E. Bollen

Contents

1 General introduction
2 Genetic epidemiology of Multiple Sclerosis: a review of the literature
3 Mapping of a susceptibility gene for Multiple Sclerosis to the 51 kb interval between G511525 and D6S1666 using a new method of haplotype sharing analysis
4 Inheritance mode of multiple sclerosis: the effect of HLA class II alleles is stronger than additive
5 Genetic difference between relapsing and primary progressive Multiple Sclerosis
6 Summary and conclusions
7 Samenvatting (summary in Dutch)
Glossary of genetic and statistical terms
Dankwoord (acknowledgements)
Curriculum vitae
Publications

Chapter 1 General introduction

Multiple sclerosis (MS) is an inflammatory demyelinating disorder of the central nervous system. It is relatively common, with a frequency of 0.1-0.2% in Northern European countries. The age at onset is most often between 20 and 40 years. Women are affected more often than men, with a female-to-male ratio between 1.4 and 2 [1].

Clinical symptoms

Clinically, MS is characterized by focal neurological symptoms of the optic nerves, spinal cord and brain. The frequency of symptoms at presentation is related to age [2]. The most common initial symptom at any age is sensory, consisting of tingling or numbness. Optic neuritis is much more prevalent as a presenting symptom in patients under 20 than in patients over 40 years old. Insidious motor symptoms, typically chronic progressive myelopathy, are characteristic of patients in their forties. Other common initial symptoms are diplopia, vertigo, limb ataxia, impairment of balance or acute motor symptoms. As the disease progresses, additional symptoms may include spasticity, pain, fatigue, dysarthria, disorders of micturition and bowel voiding, sexual disturbances, cognitive dysfunction and paroxysmal phenomena (this list is not exhaustive). Disability is measured by various scales [3-6], each of which has its own advantages and limitations.

Disease course

Most often the initial course is relapsing-remitting (80%), with exacerbations followed by remissions during which the patient recovers completely or partially [7]. The relapsing phase is followed by a secondary progressive phase in approximately two thirds of the patients after on average 10-15 years. During the secondary progressive phase, disability gradually worsens between any relapses that still occur. However, up to one third of the patients do not develop progressive disability and remain unimpaired for many years (benign MS). A smaller number of patients (10-15%) experience gradual progression from onset; this course is called primary progressive. Rarely, the disease is rapidly fatal.

Diagnosis

The diagnosis of MS is based upon evidence of two or more lesions in the central nervous system (CNS), separated in both time and space. Initially, the Schumacher criteria [8] (Table 1) were used to identify patients with clinically definite MS for inclusion in clinical trials. According to these criteria, lesions should be present predominantly in the white matter and disseminated in time and space.

Table 1: The Schumacher criteria

  Evidence on examination or in the patient's history of two or more lesions in the CNS
  Relapsing remitting disease: two or more episodes separated by at least one month and lasting a minimum of 24 hours
  Progressive disease: persistent deterioration for six months
  Onset between ages of 10 and 50 years
  No better explanation for the symptoms in the opinion of a physician competent in clinical neurology

In 1983 Poser [9] proposed criteria that also incorporated paraclinical and laboratory findings to be used as support for the diagnosis (Table 2). These criteria were used to include patients in our studies as reported in this thesis.

Table 2: The Poser criteria

  Category     Attacks   Clinical evidence   Paraclinical evidence   CSF
  A. Clinically definite MS
  CDMS A1      2         2
  CDMS A2      2         1                   and 1
  B. Laboratory-supported definite MS
  LSDMS B1     2         1                   or 1                    +
  LSDMS B2     1         2                                           +
  LSDMS B3     1         1                   and 1                   +
  C. Clinically probable MS
  CPMS C1      2         1
  CPMS C2      1         2
  CPMS C3      1         1                   and 1
  D. Laboratory-supported probable MS
  LSPMS D1     2                                                     +

In the cerebrospinal fluid (CSF), the presence of abnormally high levels of immunoglobulin G (IgG) is supportive of MS. To exclude the possibility of leakage of IgG through an impaired blood-brain barrier, the concentrations of IgG and albumin in CSF and serum are compared and expressed as the IgG index:

$$\text{IgG index} = \frac{\text{IgG in CSF} \,/\, \text{IgG in serum}}{\text{albumin in CSF} \,/\, \text{albumin in serum}}$$

CSF and plasma are also examined by isoelectric focusing (IEF). Oligoclonal IgG bands present in CSF and not in serum are strongly supportive of the diagnosis of MS. However, oligoclonal bands can also be found in patients with other diseases, such as inflammatory and/or infectious diseases of the CNS. Of MS patients, 70% have an IgG index greater than 0.7 and 95% have oligoclonal bands [10]. A modestly increased number of white blood cells in the CSF (5-35/mm³) is found in about half of MS patients, whereas numbers higher than 35-50/mm³ are very rare.

Evoked potentials may provide evidence of focal dysfunction within the optic nerve or CNS, even if symptoms are too subtle to be found at clinical examination. Visual, auditory, somatosensory and motor evoked potentials can be investigated. There is, however, only limited information on the sensitivity and specificity of evoked potentials in MS. More recently, new criteria that also incorporate the results of magnetic resonance imaging (MRI) have been introduced [11,12].

Pathogenesis

In the pathogenesis, autoimmunity is likely to play a major role [13]. Proteins that form part of the myelin sheath are suspected to be the target of the immunological reaction. However, some authors examine an alternative hypothesis of pathogenesis, suggesting that MS is a neurodegenerative and metabolic disorder with polygenic influence [14]. The heterogeneity of demyelinating lesions at pathological examination leaves room for both theories [15]. Patterns showing similarities with T-cell-mediated or T-cell plus antibody-mediated autoimmune encephalomyelitis, as well as patterns suggestive of primary oligodendrocyte dystrophy, were found.
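The IgG index described under Diagnosis is a ratio of two CSF/serum quotients, and its arithmetic can be sketched in a few lines. This is only an illustration: the function name `igg_index` and the example concentrations are assumptions made here, not values from the thesis, and each CSF/serum pair is assumed to be expressed in the same unit.

```python
def igg_index(igg_csf, igg_serum, albumin_csf, albumin_serum):
    """Return (IgG CSF / IgG serum) divided by (albumin CSF / albumin serum)."""
    values = (igg_csf, igg_serum, albumin_csf, albumin_serum)
    if min(values) <= 0:
        raise ValueError("all concentrations must be positive")
    igg_quotient = igg_csf / igg_serum
    albumin_quotient = albumin_csf / albumin_serum
    return igg_quotient / albumin_quotient


# Hypothetical concentrations (same unit within each CSF/serum pair),
# chosen only to illustrate the calculation.
example = igg_index(igg_csf=0.06, igg_serum=10.0,
                    albumin_csf=0.25, albumin_serum=40.0)

# 0.7 is the value the text reports as exceeded in 70% of MS patients.
above_reported_value = example > 0.7
```

Dividing by the albumin quotient is what corrects for leakage: albumin is not produced in the CNS, so a raised albumin quotient signals a leaky barrier and lowers the index accordingly.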

Etiology

There is evidence for both genetic and environmental factors playing a role in susceptibility to MS. Indications for environmental factors include the correlation between prevalence and geographic localization [16,17], migration studies [18-22], possible clustering [23-26] and reports of specific antibodies in body fluids [27,28]. The available epidemiologic data were recently reviewed by Marrie [29]. Indications for genetic factors are the association with specific HLA alleles, an increased risk for relatives that is proportional to the amount of DNA shared, twin studies and racial differences (chapter 2).

Aims and scope of this thesis

The subject of this thesis is the genetic factors that play a role in MS. These factors will be approached from a genetic epidemiological point of view. Major questions in this field are: What is the disease model of MS, regarding the role of genetic and environmental factors? Where are the genetic factors localized within the genome? What is the contribution of each locus? What is their mode of inheritance (dominant, recessive, intermediate)? Do loci interact?

Chapter 2 provides a review of genetic epidemiological studies with emphasis on genome screens. Since a large part of the genome had been excluded as the site of a locus with a major contribution to susceptibility, we focussed on the HLA region, the only region that has repeatedly been found to be involved in MS. In our first study (chapter 3) we tried to fine-map a locus within the HLA region, using a newly developed method of haplotype sharing analysis. Chapter 4 describes the inheritance mode of the region we fine-mapped in chapter 3. In chapter 5 we review the differences between the types of MS (relapsing-onset and primary progressive) and investigate whether a genetic difference can be found in the HLA region in our population.

References

1. Weinshenker BG. Epidemiology of multiple sclerosis. Neurol Clin 1996; 14(2):291-308.
2. Weinshenker BG, Bass B, Rice GP, Noseworthy J, Carriere W, Baskerville J et al. The natural history of multiple sclerosis: a geographically based study. I. Clinical course and disability. Brain 1989; 112 (Pt 1):133-146.
3. Kurtzke JF. Disability rating scales in multiple sclerosis. Ann N Y Acad Sci 1984; 436:347-360.
4. Hobart J, Kalkers N, Barkhof F, Uitdehaag B, Polman C, Thompson A. Outcome measures for multiple sclerosis clinical trials: relative measurement precision of the Expanded Disability Status Scale and Multiple Sclerosis Functional Composite. Mult Scler 2004; 10(1):41-46.
5. Hobart JC, Riazi A, Thompson AJ, Styles IM, Ingram W, Vickery PJ et al. Getting the measure of spasticity in multiple sclerosis: the Multiple Sclerosis Spasticity Scale (MSSS-88). Brain 2005.
6. Wingerchuk DM, Noseworthy JH, Weinshenker BG. Clinical outcome measures and rating scales in multiple sclerosis trials. Mayo Clin Proc 1997; 72(11):1070-1079.
7. Lublin FD, Reingold SC. Defining the clinical course of multiple sclerosis: results of an international survey. National Multiple Sclerosis Society (USA) Advisory Committee on Clinical Trials of New Agents in Multiple Sclerosis. Neurology 1996; 46(4):907-911.
8. Schumacher GA, Beebe G, Kibler RF, Kurland LT, Kurtzke JF, McDowell F et al. Problems of experimental trials of therapy in Multiple Sclerosis: report by the panel on the evaluation of experimental trials of therapy in Multiple Sclerosis. Ann N Y Acad Sci 1965; 122:552-568.
9. Poser CM, Paty DW, Scheinberg L, McDonald WI, Davis FA, Ebers GC et al. New diagnostic criteria for multiple sclerosis: guidelines for research protocols. Ann Neurol 1983; 13(3):227-231.
10. Andersson M, Alvarez-Cermeno J, Bernardi G, Cogato I, Fredman P, Frederiksen J et al. Cerebrospinal fluid in the diagnosis of multiple sclerosis: a consensus report. J Neurol Neurosurg Psychiatry 1994; 57(8):897-902.
11. McDonald WI, Compston A, Edan G, Goodkin D, Hartung HP, Lublin FD et al. Recommended diagnostic criteria for multiple sclerosis: guidelines from the International Panel on the diagnosis of multiple sclerosis. Ann Neurol 2001; 50(1):121-127.
12. Polman CH, Reingold SC, Edan G, Filippi M, Hartung HP, Kappos L et al. Diagnostic criteria for multiple sclerosis: 2005 revisions to the "McDonald Criteria". Ann Neurol 2005.
13. Hafler DA, Slavik JM, Anderson DE, O'Connor KC, De Jager P, Baecher-Allan C. Multiple sclerosis. Immunol Rev 2005; 204:208-231.
14. Chaudhuri A, Behan PO. Multiple sclerosis is not an autoimmune disease. Arch Neurol 2004; 61(10):1610-1612.
15. Lucchinetti C, Bruck W, Parisi J, Scheithauer B, Rodriguez M, Lassmann H. Heterogeneity of multiple sclerosis lesions: implications for the pathogenesis of demyelination. Ann Neurol 2000; 47(6):707-717.
16. Kurtzke JF. MS epidemiology world wide. One view of current status. Acta Neurol Scand Suppl 1995; 161:23-33.
17. Poser CM. The epidemiology of multiple sclerosis: a general overview. Ann Neurol 1994; 36 Suppl 2:S180-S193.
18. Kurtzke JF, Dean G, Botha DP. A method for estimating the age at immigration of white immigrants to South Africa, with an example of its importance. S Afr Med J 1970; 44(23):663-669.
19. Kurtzke JF. Epidemiologic evidence for multiple sclerosis as an infection. Clin Microbiol Rev 1993; 6(4):382-427.
20. Kurtzke JF, Delasnerie-Laupretre N, Wallin MT. Multiple sclerosis in North African migrants to France. Acta Neurol Scand 1998; 98(5):302-309.
21. Dean G, Kurtzke JF. On the risk of multiple sclerosis according to age at immigration to South Africa. Br Med J 1971; 3(777):725-729.
22. Elian M, Nightingale S, Dean G. Multiple sclerosis among United Kingdom-born children of immigrants from the Indian subcontinent, Africa and the West Indies. J Neurol Neurosurg Psychiatry 1990; 53(10):906-911.
23. Kurtzke JF, Hyllested K. Multiple sclerosis in the Faroe Islands: I. Clinical and epidemiological features. Ann Neurol 1979; 5(1):6-21.
24. Kurtzke JF, Hyllested K, Heltberg A. Multiple sclerosis in the Faroe Islands: transmission across four epidemics. Acta Neurol Scand 1995; 91(5):321-325.
25. Poser CM, Hibberd PL. Analysis of the 'epidemic' of multiple sclerosis in the Faroe Islands. II. Biostatistical aspects. Neuroepidemiology 1988; 7(4):181-189.
26. Poser CM, Hibberd PL, Benedikz J, Gudmundsson G. Analysis of the 'epidemic' of multiple sclerosis in the Faroe Islands. I. Clinical and epidemiological aspects. Neuroepidemiology 1988; 7(4):168-180.
27. Alotaibi S, Kennedy J, Tellier R, Stephens D, Banwell B. Epstein-Barr virus in pediatric multiple sclerosis. JAMA 2004; 291(15):1875-1879.
28. Croxford JL, Olson JK, Anger HA, Miller SD. Initiation and exacerbation of autoimmune demyelination of the central nervous system via virus-induced molecular mimicry: implications for the pathogenesis of multiple sclerosis. J Virol 2005; 79(13):8581-8590.
29. Marrie RA. Environmental risk factors in multiple sclerosis aetiology. Lancet Neurol 2004; 3(12):709-718.


Chapter 2 Genetic epidemiology of Multiple Sclerosis: a review of the literature

Maartje Boon, Jacques De Keyser, Charles H.C.M. Buys, Gerard J. te Meerman

Submitted for publication

Multiple Sclerosis (MS) is a disease characterized by inflammation and demyelination of the central nervous system (CNS). The severity and localization of the symptoms vary widely, from little or no residual symptoms over the years to severe handicap or even death within a few weeks to months. MS is a major cause of disability among young adults [1,2]. In the pathogenesis, autoimmunity is likely to play a major role [3]. Proteins that form part of the myelin sheath are suspected to be the target of the immunological reaction. However, some authors examine an alternative hypothesis of pathogenesis, suggesting that MS is a neurodegenerative and metabolic disorder with polygenic influence [4].

The etiology of MS is unknown, but there is evidence for both genetic and environmental factors playing a role in susceptibility to MS. Indications for environmental factors include the correlation between prevalence and geographic localization [5,6], migration studies [7-11], possible clustering [12-15] and reports of specific antibodies in body fluids [16,17]. The available epidemiologic data were recently reviewed by Marrie [18].

For almost all diseases, the genetic make-up of an individual will to some extent determine susceptibility to and expression of the disease. Some diseases are entirely genetic in origin (e.g. myotonic dystrophy), although environmental factors may influence the course of the disease. In other diseases the balance tips the other way, e.g. in infectious diseases, where an individual can be more or less susceptible to the external pathogen that is the primary cause of the disease. The influence of genes on susceptibility to a certain disease can be investigated by observational epidemiological studies (e.g. twin studies), association and linkage studies, and alternative methods like haplotype sharing analysis.

2.1 Observational studies

2.1.1 Studies of familial aggregation

Over a century ago Eichhorst called MS an inherited, transmissible disease [19]. The main indication for this hypothesis was the observation that MS tended to show familial aggregation. Studies of familial aggregation examine whether the disease prevalence in genetically related family members of affected individuals is increased compared with the prevalence of the disease in the general population. If so, this is an indication for the influence of genetic factors on susceptibility to the disease. A chi-squared test of association is used to determine whether there is an increased disease frequency in the relatives of affected individuals as compared with the relatives of controls; a 2x2 table is analyzed for a difference in proportions. A significant result indicates an increased (or decreased) incidence of disease in relatives of affected individuals as compared with controls. Another statistic in this context is the familial relative risk (RR), which has the advantage that it quantifies the degree of risk to relatives. The RR is also denoted as λ. It is calculated as the ratio of the disease rate in relatives of affected individuals to the disease rate in relatives of controls. An RR significantly larger than 1 indicates that the disease is more prevalent in the relatives of affected individuals than in the general population (inferred from the controls). The RR can be underestimated because of incomplete or reduced penetrance of the disease or a high frequency of the disease susceptibility gene. Interesting in this respect are the studies that show biochemical and radiological manifestations of MS in clinically healthy sibs of MS patients (see section 2.1.4, Subclinical MS?). Overestimation can occur because of ascertainment bias: patients with a disease are more likely to know of individuals in their family with the same disease than unaffected controls are.
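The 2x2 analysis and the familial RR described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the counts are hypothetical, the statistic is the plain Pearson chi-squared without continuity correction, and the p-value uses the identity that the upper tail of a chi-squared distribution with one degree of freedom equals erfc(sqrt(x/2)).

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (1 df, no continuity correction).

    Table layout (rows = group, columns = affected / unaffected):
        relatives of patients : a, b
        relatives of controls : c, d
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    p_value = erfc(sqrt(statistic / 2))  # upper tail, 1 degree of freedom
    return statistic, p_value


def familial_relative_risk(a, b, c, d):
    """Disease rate in relatives of patients / rate in relatives of controls."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts, for illustration only:
# 30 of 1000 relatives of patients affected, 10 of 1000 relatives of controls.
statistic, p_value = chi_squared_2x2(30, 970, 10, 990)
rr = familial_relative_risk(30, 970, 10, 990)
```

The test only says whether the two proportions differ; the RR adds the size of the difference, which is why the text prefers it for quantifying risk to relatives.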
An alternative to the familial RR is the population RR, where the probability of a relative of an affected individual being affected is compared with the probability of a random member of the population being affected. In many of the classical family studies, the observations on recurrence of MS in families are reported as the frequency of familial cases, i.e. the proportion of MS index cases that have a close relative with MS. This parameter, however, also depends on the family sizes in the population and the mode of inheritance. Therefore, a better parameter to use is the prevalence of MS among specific types of relatives in proportion to the total number of relatives with the same family relation. In order to examine whether the prevalence among relatives is increased, it needs to be compared to the prevalence in the population concerned and ideally corrected for age and sex.

Table 1: Comparison of age-adjusted lifetime risks by relationship to the proband - data for a Northern European population living in a temperate climate.

  Relationship to proband          Approximate risk (%)   Relative risk to general population   % Genetic sharing with the proband
  General population               0.2                    1                                     0
  First-degree relative            3-5                    15-25                                 50
  Dizygotic twin                   3-5                    15-25                                 50
  Monozygotic twin                 38                     190                                   100
  Adopted first-degree relative    0.2                    1                                     0
  Half-sib                         1.3                    6.5                                   25
  Offspring of conjugal MS (a)     29.5                   147.5                                 50 (b)

  (a) It may be more appropriate to compare crude rates for this group.
  (b) The child shares 50% of the genetic material with the affected mother and 50% of the genetic material with the affected father.
  (From Sadovnick et al. [20])

The recurrence risk for relatives of MS patients increases in proportion to the degree of kinship, probably as a manifestation of the amount of DNA shared. The size of the increase depends on the sex of the index patient and the relative, the type of kinship and the age of the relative. The age-adjusted recurrence rate for first-degree relatives appears to be 1.3-5%. Since the population risk is 0.1-0.2%, this results in an RR (λr, recurrence risk of relative/population risk) of up to 40 or 50. The relative risk for second-degree relatives is around 3, and in third-degree relatives the risk, at 0.3-1.5%, is still considerably higher than the population risk [21-29].
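The relative-risk column of Table 1 follows from dividing each approximate risk by the general population risk of 0.2%. The short sketch below reproduces that arithmetic; the variable and function names are illustrative choices made here, and the risks are copied from Table 1.

```python
POPULATION_RISK = 0.2  # %, general population row of Table 1

# Approximate age-adjusted lifetime risks in % (low, high), from Table 1.
risks = {
    "First-degree relative": (3.0, 5.0),
    "Monozygotic twin": (38.0, 38.0),
    "Half-sib": (1.3, 1.3),
    "Offspring of conjugal MS": (29.5, 29.5),
}


def relative_risk(risk_percent, population_percent=POPULATION_RISK):
    """Risk of the relative divided by the population risk."""
    # Rounded to one decimal to match the precision of the table.
    return round(risk_percent / population_percent, 1)


relative_risks = {
    relation: (relative_risk(low), relative_risk(high))
    for relation, (low, high) in risks.items()
}
```

The same division with a population risk of 0.1% instead of 0.2% doubles every ratio, which is how a 5% recurrence risk yields the upper figure of 50 quoted in the text.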

2.1.2 Shared environment vs genes in familial clustering

Although familial clustering may also be the consequence of shared environment, various types of epidemiological studies argue against the importance of this factor in familial clustering in MS, unlike in regional clustering [28,30].

Twin studies

Twin studies are a classical method to discriminate between the contributions of genes and environment in the etiology of diseases. Monozygotic twins are genetically almost identical, whereas dizygotic twins share on average 50% of their DNA, like sibs that are not twins. The 50% of DNA that dizygotic twins share on average is due to the fact that for 25% of the genome they share both chromosomes (100%), for 50% of the genome they share one chromosome (50%) and for 25% of the genome they share no chromosomes (0%). Discordance for a trait between monozygotic twins is thus mainly attributed to environmental and not to genetic factors. In dizygotic twins, both genetic and environmental factors may be responsible for discordance. A possible source of bias in twin studies is excess inclusion of concordant pairs compared to discordant pairs. This type of bias, however, should not differ between dizygotic and monozygotic twins.

For MS, several twin studies have shown a consistent difference in concordance between monozygotic and dizygotic twins. Concordance in monozygotic twins is around 30%, whereas in dizygotic twins it is 3-5% [31-37]. However, in interpreting these numbers one has to realize that sometimes the healthy twin of a discordant pair of identical twins does have neurological symptoms, oligoclonal bands in the CSF or MRI abnormalities without ever meeting the criteria for MS [31,36,38]. Another limitation of twin studies is that the sample will necessarily be relatively small [37]. An interesting observation from twin studies is that even if two individuals are more or less genetically identical, they are still discordant for MS in 70-75% of cases. Postzygotic events during formation of the embryo and placenta [39] and epigenetic effects later in life [40] are likely to play a role, and these are also influenced by environmental factors. Differences in mitochondrial DNA between members of a twin pair are another possible influence [41,42].
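The 50% average sharing for dizygotic twins quoted above is a weighted mean over the three sharing states. As a small worked check (the variable names are illustrative, and exact fractions are used to avoid rounding):

```python
from fractions import Fraction

# (fraction of the genome in this state, proportion of DNA shared in it)
sharing_states = [
    (Fraction(1, 4), Fraction(1)),     # both chromosomes shared
    (Fraction(1, 2), Fraction(1, 2)),  # one chromosome shared
    (Fraction(1, 4), Fraction(0)),     # no chromosome shared
]

total_weight = sum(weight for weight, _ in sharing_states)
expected_sharing = sum(weight * shared for weight, shared in sharing_states)
```

The three weights sum to 1, and the weighted mean is exactly one half, the same value that applies to ordinary sibs.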

Studies of half-sibs, adoptees and conjugal pairs

Ebers et al. found an age-adjusted recurrence risk of 3.11% for full siblings of a patient, whereas for half-siblings in the same families it was 1.89%, indicating the effect of the amount of DNA shared [43]. There was a difference between paternal (1.31%) and maternal (2.35%) half-sibs, suggestive of a maternal effect. This may be due to various factors, e.g. environmental (pre- and postnatal), genetic (mitochondrial) or epigenetic (parental imprinting). There was no difference in risk between half-sibs who did and did not live with the MS patient. This observation supports the hypothesis that the familial aggregation of MS is mainly caused by genetic rather than environmental factors.

Non-biological first-degree relatives of adopted patients have a risk comparable to the population risk [30]. Biological relatives that do not live with the patient, however, have the same risk as relatives from intact nuclear families, despite the ascertainment bias. This again indicates that genetic factors are more important than shared environment in familial aggregation of MS.

Children of conjugal pairs in which both parents have MS have a recurrence risk of MS of 1:17 (age-adjusted 1:5), compared to a risk of 1:200 (age-adjusted 1:50) for children with a single affected parent. To assess the influence of environmental factors, conjugal pairs were compared. There was no evidence for clinical concordance, clustering at year of onset or distortion of the expected pattern of age at onset in the second affected spouse [44,45]. The Canadian data also showed that the rate of MS among spouses of patients was not increased compared with the population risk, arguing against transmission [45].
2.1.3 Effect of consanguinity

In several of the clusters described, consanguinity appears to play a role 46,47. If a patient's unaffected parents are first cousins, the recurrence risk for his/her siblings is nearly four times higher than when the parents are unrelated 48. This effect has been attributed to a recessive mode of inheritance of susceptibility alleles.

2.1.4 Subclinical MS?

In epidemiological studies, most often only patients with definite or probable MS are considered affected. However, if clinically unaffected relatives of patients are examined by means of laboratory techniques, the number of

relatives showing abnormalities considered characteristic of MS is higher. MRI studies in relatives of MS patients show lesions compatible with MS in some of the clinically unaffected relatives 49-52. Likewise, examining healthy sibs in multiplex families, Duquette et al. reported oligoclonal bands in 18% 53. Xu et al. found oligoclonal bands in the CSF of 13 of 15 clinically unaffected twins of MS patients, 3 of whom developed clinical manifestations of MS in the 6-year follow-up period 54. Haghighi et al. found two or more oligoclonal bands in 9/47 healthy sibs of MS patients as opposed to 2/50 controls 55. Nuwer et al. have shown abnormal evoked potentials in up to 35% of healthy sibs of patients from multiplex families 56. MS lesions have also been found at autopsy in asymptomatic individuals 57,58. There may be various explanations for these findings. The sibs who showed abnormal test results may have a subclinical type of MS that may or may not become clinically manifest at follow-up (at this point, data are limited). It is possible that they carry the same genetic burden as their affected sibs, but with reduced penetrance. Differences in environmental factors, or in the age at which sibs are exposed to these factors, may play a role. If one assumes polygenic inheritance and/or interaction between genes contributing to MS, a sib could be more or less strongly predisposed to MS if he or she carries more or fewer disease-related alleles. Another explanation for the presence of subclinically affected sibs is that the disease has a very mild course and the lesions are localized in clinically silent areas of the brain. For the spinal cord, however, this might be harder to argue.

Differences in the frequency of MS between various ethnic or racial groups 59 have been described.
In epidemiology, these differences are often considered preliminary evidence of a genetic contribution to the disease, although it is difficult to establish that the environments of the groups are the same.

2.2 Segregation / model of disease

For many Mendelian diseases, underlying mutations in specific genes have been found. Overall, mutational diversity at each of these loci is high, each mutation is rare, having occurred in recent human history (no older than 2000 years), and each mutation is necessary and sufficient to cause the phenotype of interest. However, most human phenotypes and diseases are complex and Mendelian patterns do not apply. The segregation of MS in multiplex families is not consistent with a fully penetrant single-gene disorder. The prevalence in relatives of MS patients decreases with increasing genetic distance in a non-linear fashion, which also argues against single-locus Mendelian inheritance. Epidemiological characteristics suggest that MS is a complex disease. Complex diseases show familial aggregation but a high proportion of sporadic cases, no Mendelian pattern of inheritance, reduced penetrance, phenocopies, and etiologic and genetic (locus and allelic) heterogeneity. It is suspected that mutations leading to a complex phenotype occur in multiple genes. The genetic model is often represented as a curve on which affected individuals are those who cross a biological threshold of risk. The contributing alleles at multiple independent loci may be either rare or common, depending on the model applied.

[Figure 1: risk curve with a threshold above which individuals are affected. Rare variants: affected genotypes aa/++/++/.., ++/bb/++/.., ++/++/cc/.. Common variants: affected genotype aabbcc..]

Figure 1 Competing theories for complex disease inheritance (a, b, c and so on are susceptibility/protective alleles at multiple, independent loci). For a fixed disease incidence, individuals who are clinically affected can either have mutations at only one of many possible disease loci (in which case the mutant alleles are rare in the population) or harbour mutations at multiple loci simultaneously (in which case the mutant alleles are common in the population). These hypotheses are the extremes of many other possible intermediate scenarios. (From Chakravarti 60)
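The contrast drawn in the caption of Figure 1 can be made concrete with a small calculation. The sketch below is only an illustration under strong simplifying assumptions that are not part of the source: fully penetrant recessive risk genotypes (aa, bb, cc), independent loci in Hardy-Weinberg equilibrium, and the same risk-allele frequency q at every locus. For a fixed incidence it solves for the allele frequency required under each of the two extreme models.

```python
def incidence_rare_model(q, n_loci):
    """Affected if homozygous for the risk allele at ONE OR MORE of n_loci."""
    return 1.0 - (1.0 - q * q) ** n_loci


def incidence_common_model(q, n_loci):
    """Affected only if homozygous for the risk allele at ALL n_loci."""
    return (q * q) ** n_loci


def allele_freq_rare_model(incidence, n_loci):
    """Invert incidence_rare_model: q = sqrt(1 - (1 - K)^(1/n))."""
    return (1.0 - (1.0 - incidence) ** (1.0 / n_loci)) ** 0.5


def allele_freq_common_model(incidence, n_loci):
    """Invert incidence_common_model: q = K^(1/(2n))."""
    return incidence ** (1.0 / (2 * n_loci))


# Hypothetical example: incidence of 1 per 1000 and three loci (a, b, c).
K, N = 0.001, 3
q_rare = allele_freq_rare_model(K, N)
q_common = allele_freq_common_model(K, N)
print(f"rare-variant model:   q = {q_rare:.4f}")
print(f"common-variant model: q = {q_common:.4f}")
```

Under these assumptions the same incidence requires risk alleles with a frequency of about 2% in the first model and about 32% in the second, which is the sense in which the caption calls the alleles rare or common.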

For MS, many models have been proposed and tested 61. Parametric methods incorporate the parameters known or suspected to influence susceptibility and its segregation. Calculations are made while varying the values and interactions of these parameters. The results are compared with the observed data and the best-fitting model is proposed. Sometimes, in order to make calculations feasible, assumptions are made that are not realistic 62. Parametric methods specify a model for the disease in question, but the validity of this model is not always established. Therefore, non-parametric methods are often favoured 63,64. Nevertheless, (parametric) likelihood methods could in principle make better use of the data than non-parametric methods, by interpreting the data against the backdrop of the evolutionary history that links the observed haplotypes 65. Non-parametric methods sometimes try to summarize the information of all haplotypes in one value. Since similar values may result from different underlying evolutionary models in cases compared with controls, this might result in less power 65. Another limitation of non-parametric methods, especially when studying complex disorders, is that including covariates such as other genetic or environmental factors is difficult 66. The best-fitting model is likely to be a multifactorial model with a number of genes, one of which is a major gene situated in the HLA-DRDQ region, and one or more environmental factors. The MHC is estimated to account for 10-60% of the genetic component of MS susceptibility in Caucasian populations of northern European descent 67,68. We have found indications that the effect of the HLA-related gene (or genes) is stronger than additive, suggesting a recessive component 61.

2.3 Methods of genetic analysis

In complex diseases, frequent variants of a number of genes, each with a modest effect, may play a role. Phenocopies, genetic (locus and allelic) heterogeneity, reduced penetrance and the contribution of environmental factors complicate the search for susceptibility genes even more. With the sequencing of the human genome 69, new possibilities to learn about the genetic background of complex diseases have emerged. Millions of single nucleotide polymorphisms (SNPs) provide potential markers. However, it is equally important to develop strategies to distinguish polymorphisms that are relevant for susceptibility to, or the course of, a disease from those that are only in linkage disequilibrium with them 60. For example, associations between HLA types and a number of diseases have been known for decades, but identification of the causal variants has proven very difficult.

2.3.1 Linkage analysis

Linkage analysis assesses the segregation of a genetic variant in families with multiple affected members. Classical linkage analysis within families will typically resolve the position of a novel gene to 10-20 cM, with further precise location obtained by linkage disequilibrium mapping within this region 70,71. In complex diseases, linkage analysis has had only limited success 72. The majority of positive linkages for the same disease could not be replicated. No single study design consistently produced more significant results. The only variables independently associated with increased study success were a larger number of individuals studied and sampling from only one ethnic group. Kruglyak and Lander 73 have shown that for loci with a modest relative risk (RR) the sample size needed for high-resolution linkage mapping is very large. The power of a linkage genome screen depends on the frequency of susceptibility alleles in the population studied and may thus vary between populations 74.
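Linkage evidence in this chapter is reported as LOD scores. As a reminder of what such a score measures, the following is a minimal sketch of a two-point LOD score for the simplest textbook situation: phase-known, fully informative meioses in which recombinants can be counted directly. This is an idealisation; the screens discussed here used pedigree likelihood software, and the function names and example counts below are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """log10 of the likelihood at recombination fraction theta
    relative to the likelihood under free recombination (theta = 0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # Any observed recombinant is impossible under complete linkage.
        if recombinants > 0:
            return float("-inf")
        return meioses * math.log10(2.0)
    return (recombinants * math.log10(theta)
            + non_recombinants * math.log10(1.0 - theta)
            + meioses * math.log10(2.0))


def max_lod(recombinants, meioses):
    """Maximum-likelihood estimate of theta (capped at 0.5) and its LOD."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)


# Hypothetical example: 2 recombinants observed in 20 informative meioses.
theta_hat, z_max = max_lod(2, 20)
print(f"theta = {theta_hat:.2f}, Zmax = {z_max:.2f}")
```

In this example the estimate is theta = 0.10 with a maximum LOD just above 3, i.e. odds of roughly 1000 : 1 in favour of linkage over free recombination.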
In multiple sclerosis, linkage screens have so far had limited results 75. Although every genome screen identified regions of interest, none has demonstrated linkage with genomewide significance. A meta-analysis of the available raw data of 11 whole genome screens for linkage, using microsatellite markers at a density of about one marker every 10 cM, included over 700 multiplex families. However, linkage with genomewide significance was found only in the MHC region 76. The apparent technical limitations were likely to have reduced the power of the analyses. This stimulated the

International Multiple Sclerosis Genetics Consortium to conduct a linkage screen in 730 multiplex families (comprising 2692 individuals) using a set of almost 6000 markers 75.

2.3.2 Association analysis

Association studies test whether a genetic marker (polymorphism or haplotype) occurs more frequently in cases than in controls. If significant association emerges and the possible bias of population stratification is excluded, the polymorphism is either located in the susceptibility locus itself or in linkage disequilibrium with it. Careful application of the method is warranted in order to avoid several confounding factors, such as variable definitions of the phenotypes, the aforementioned population stratification, or methodological flaws such as insufficient numbers 74,77,78. In order to avoid false-positive results caused by population stratification, family-based controls can be used 79,80, for example trios consisting of a patient and both parents. The non-transmitted haplotypes of the parents can be used as (pseudo-)controls. In monogenic diseases, single-locus association methods have proven to be powerful tools for mapping disease loci and identifying genes. However, in complex diseases single-locus methods have so far had limited success. In 1997, genome-wide association screening using pooled DNA and 6000 microsatellite markers was thought to be an efficient strategy to find predisposing genes outside the HLA region 81. For MS, a number of studies did not show significant association (see Table 2). DNA was pooled to reduce the number of assays. This pooling of DNA appeared to be complicated by specific artifacts. In pooled DNA, there is a tendency to overestimate the frequency of short alleles, partly as a consequence of stutter bands that may be hard to distinguish from actual peaks. The use of pooled DNA is also hampered by length-dependent amplification, which distorts allele frequencies 82,83.
If a complex disease is indeed caused by a number of genes, and for each of these genes by frequent alleles each with a small effect, very large sample sizes and much denser marker sets will be necessary to find these genes by means of association analysis 74. However, with the current availability of many SNPs and more efficient genotyping methods, mapping and identification of disease alleles contributing to complex diseases using association analysis may come within reach.
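The basic case-control comparison described in this section, testing whether a marker allele is more frequent in cases than in controls, is commonly carried out as a chi-square test on a 2 x 2 table of allele counts. The sketch below shows that calculation using only the standard library; the allele counts are invented for illustration, and no continuity correction or adjustment for population stratification is applied.

```python
import math


def allelic_chi_square(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square (1 degree of freedom) on a 2 x 2 table of allele
    counts. Returns the statistic and its two-sided p-value."""
    total = case_risk + case_other + control_risk + control_other
    row_cases = case_risk + case_other
    row_controls = control_risk + control_other
    col_risk = case_risk + control_risk
    col_other = case_other + control_other
    denominator = row_cases * row_controls * col_risk * col_other
    if denominator == 0:
        raise ValueError("every row and column must contain observations")
    cross = case_risk * control_other - case_other * control_risk
    chi2 = total * cross * cross / denominator
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 200 case and 200 control chromosomes.
chi2, p_value = allelic_chi_square(60, 140, 40, 160)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

A nominal p-value of this kind still has to be corrected for the number of markers tested, which is the multiple-testing issue discussed later in this section.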

Linkage disequilibrium (LD)-based methods 84,85 use multilocus association analysis. These methods are based on the assumption that affected individuals in the present generation have inherited their susceptibility alleles from common ancestors. A disease-related mutation was introduced on a haplotype in a previous generation. With each generation, recombination events during meiosis may have shortened the original haplotype. Therefore, a marker locus and a disease locus are expected to be in LD only if the distance between them is small. The strength of LD is influenced by a number of other factors, such as the number and age of disease mutations and population admixture. Association-based studies using LD mapping are more sensitive and can also detect minor genes. However, multilocus analysis also requires a sufficient density of markers. The marker density needed depends on the average extent of LD and the number of haplotype blocks 86-89. Kruglyak estimated on the basis of simulation studies that a useful level of LD is unlikely to extend beyond an average distance of about 3 kb in the general population. This implies that approximately 500,000 to 1,500,000 SNPs will be required for whole genome studies. The use of microsatellite markers (which are multiallelic and thus potentially more informative) will probably be too inefficient and expensive in association studies. According to Kruglyak, the extent of LD is similar in isolated populations, unless the founding bottleneck is very narrow or the frequency of the variant is low 86. Other authors indicate that LD may span megabases in special populations 90. In that case, much lower marker densities could generate substantial power. Issues concerning LD and haplotype blocks are reviewed by Wall and Pritchard 89. Recently a special type of association analysis, admixture mapping, has been used to identify genes contributing to MS 91.
Admixture mapping tries to identify genomic regions where individuals with MS from admixed populations tend to have an unusually high proportion of ancestry from one of the ancestral populations. This may indicate the presence of a multiple sclerosis risk variant that differs in frequency between the ancestral populations. Admixture mapping has statistical power to detect factors that differ markedly in frequency across populations. An advantage is that, for instance in the African American population, there has been contact between the ancestral populations for on average only six generations. Thus, few recombinations have taken place, and long conserved segments of ancestral chromosomes allow less dense marker sets. Reich et al. 91 reported strong association between the extent of European ancestry and MS around the centromere of chromosome 1. However, they did not find association with HLA. This is the consequence of a limitation of admixture

mapping: it cannot detect disease loci at which the total risk summed over all alleles is similar in both populations. In order to confirm that a true disease-related SNP has been identified, replication of the association in another, independent sample is required. This is more likely to be successful using multi-locus (haplotypic) association than single-locus association methods. The probability of a unique haplotype being associated with a disease will be high, since frequencies of haplotypes are much lower than those of alleles. Another issue concerns correction for multiple testing using correction methods such as Bonferroni's or Holm's. These methods are often conservative, with a reduction in power as a consequence 92,93. Different statistical methods have been developed that reduce the number of comparisons 94. The choice of a LOD score of 3 implies some correction for multiple testing in monogenic disease, where a single gene causing the disease is likely. However, the appropriate multiple testing correction should depend on the number of independent tests. When the tests of adjacent markers are strongly correlated, a lower correction factor is required than when there is less dependency between tests, as in single-locus association analysis. The degree of LD is an indication of the correlation between tests at consecutive loci in multilocus methods. Apart from multiple testing correction, the a priori probability of a gene with an effect that can be found with the current sample size is unknown, which leads to a situation where only replicated results can be trusted 95.

2.3.3 Non-parametric methods for gene mapping

Affected sibpair 96,97 and pedigree member 98-100 studies compare genotypes of two or more affected individuals from a pedigree in order to find genomic areas that are shared in excess. The problem of non-penetrance is avoided by this approach, but the problems of phenocopies and heterogeneity are not.
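What sharing "in excess" means for affected sib pairs can be illustrated with the classical mean test. The sketch below assumes an idealised situation that is not claimed for any of the cited studies: the number of alleles shared identical by descent (0, 1 or 2) is known without ambiguity for every pair. Without linkage these outcomes have probabilities 1/4, 1/2 and 1/4, so a pair shares one allele on average, with variance 1/2. The counts in the example are invented.

```python
import math


def mean_ibd_test(pairs_sharing_0, pairs_sharing_1, pairs_sharing_2):
    """One-sided mean test for excess allele sharing in affected sib pairs.
    Returns (proportion of alleles shared, z statistic, p-value)."""
    n_pairs = pairs_sharing_0 + pairs_sharing_1 + pairs_sharing_2
    if n_pairs == 0:
        raise ValueError("at least one sib pair is required")
    alleles_shared = pairs_sharing_1 + 2 * pairs_sharing_2
    expected = n_pairs          # one shared allele per pair under the null
    variance = n_pairs / 2.0    # variance 1/2 per pair under the null
    z = (alleles_shared - expected) / math.sqrt(variance)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal
    proportion = alleles_shared / (2.0 * n_pairs)
    return proportion, z, p_value


# Hypothetical data: 100 affected sib pairs sharing 0, 1 and 2 alleles.
proportion, z, p_value = mean_ibd_test(15, 50, 35)
print(f"proportion shared = {proportion:.2f}, z = {z:.2f}, p = {p_value:.4f}")
```

Programs such as MAPMAKER/SIBS, which appear in Table 2, instead estimate sharing probabilities by maximum likelihood from marker data that are usually only partly informative.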
In complex diseases, the relatively low number of multiplex families also often reduces the power of these methods. This is also the case in a number of studies in MS where these methods were applied 101-105. The transmission/disequilibrium test (TDT) evaluates whether the two alleles of a heterozygous parent are transmitted with equal probability to his/her affected child 106. In the absence of linkage and association, both alleles have an equal probability of being transmitted (both 50%). If one allele is transmitted preferentially to affected offspring, it is likely to be

associated with the disease in question. However, the TDT only shows significant association when the marker is linked to the disease locus 106,107. When applied to more complex diseases, this method failed to identify relevant gene loci. Apart from the fact that most studies of complex genetic disease have too small a sample size, this can be attributed to genetic effects for which the TDT has reduced power, e.g. effects of frequent alleles of specific genes. This may be the case in MS as well 105,108-112 (see also Table 2), but studies with larger numbers of patients show interesting results concerning the HLA region 113,114.

2.3.4 The Haplotype Sharing Statistic

Analysis of haplotype overlap can be used to map a disease gene. Once it is certain that haplotypes contain a disease gene, the smallest fragment shared by patients and not by controls is likely to contain this gene. However, when it is not certain that haplotypes contain a disease gene, the overlap of haplotypes needs to be evaluated statistically. Te Meerman and Van der Meulen developed the Haplotype Sharing Statistic (HSS), a non-parametric multilocus association method for fine-mapping disease genes 115-117. They described the perspectives of identity by descent (IBD) mapping in founder populations and showed the method to be efficient owing to the large number of meioses implicitly observed and the reduced heterogeneity in these populations 115. However, the applicability of the method is not limited to founder populations in the narrow sense, as with the current ultra-high marker density many populations show LD in sets of many nearby markers. The method can handle the situation of multiple introductions of a particular disease mutant and thus different surrounding haplotypes. Nolte et al. developed the method further 118. HSS compares the length of shared haplotypes among patients with the length of shared haplotypes among controls.
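The comparison at the heart of HSS can be sketched in a few lines. The code below is an illustration only and not the published statistic: it takes the sharing of two haplotypes at a locus to be the number of consecutive marker intervals around that locus over which their alleles are identical (one reading of the definition given with Figure 2), averages this over all pairs of haplotypes, and reports the raw difference between patients and controls without any standardisation or significance test. Haplotypes are represented as lists of allele labels, and all names and data are hypothetical.

```python
from itertools import combinations


def pair_sharing(hap_a, hap_b, locus):
    """Number of marker intervals around `locus` over which two haplotypes
    carry identical alleles (0 if they already differ at `locus`)."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same markers")
    if hap_a[locus] != hap_b[locus]:
        return 0
    left = locus
    while left > 0 and hap_a[left - 1] == hap_b[left - 1]:
        left -= 1
    right = locus
    while right < len(hap_a) - 1 and hap_a[right + 1] == hap_b[right + 1]:
        right += 1
    return right - left


def mean_sharing(haplotypes, locus):
    """Average sharing at `locus` over all pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    if not pairs:
        raise ValueError("at least two haplotypes are required")
    return sum(pair_sharing(a, b, locus) for a, b in pairs) / len(pairs)


def sharing_difference(patient_haps, control_haps, locus):
    """Excess mean sharing among patient haplotypes at `locus`."""
    return mean_sharing(patient_haps, locus) - mean_sharing(control_haps, locus)


# Hypothetical four-marker haplotypes.
patients = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 5]]
controls = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 1, 2, 2]]
for locus in range(4):
    print(locus, round(sharing_difference(patients, controls, locus), 2))
```

In practice the difference at each locus would be evaluated against its sampling variation, and the haplotypes themselves would be derived from phased family data.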
The length of haplotype sharing is taken to be an approximate measure of the probability of two haplotypes being identical by descent (IBD). The probability of IBD increases with increasing length of haplotype sharing, although not linearly 119. The hypothesis is that haplotype sharing among patients is larger than among controls at loci involved in the disease, because (i) haplotypes containing risk alleles are likely to be similar more often, especially close to the risk allele, and (ii) haplotypes containing the risk alleles may be shared over longer stretches. The first factor can be understood from the concepts of association and LD. The second factor is the consequence of the patient haplotypes

containing mutations that are genetically younger than the wild type allele. Thus, fewer meioses and consequently fewer recombinations have taken place on the patient haplotypes containing a disease mutation than on the wild type haplotype.

Figure 2: The sharing of two haplotypes

The sharing of two haplotypes at a locus is defined as the number of intervals between successive marker loci at which alleles of these haplotypes are identical (Figure 2). In order to know the phase of the haplotypes, trios are used, preferably consisting of an affected individual and his or her parents. The non-transmitted haplotypes of the parents are used as controls. If parents are not available, a spouse and child can be used. In that case, the haplotypes of the spouse serve as controls. However, there is then a risk of population stratification that is not present when non-transmitted haplotypes are used. The sharing of all pairs of haplotypes is calculated for each marker locus for the patient and control datasets separately, and the results of the datasets are compared. HSS is more powerful and more accurate in fine-mapping than association analysis and the TDT for high-frequency risk alleles that are considerably younger than the wildtype alleles. If, however, the risk alleles and the

wildtype alleles are of the same age, HSS has no power, whereas association analysis and the TDT do. Nolte et al. demonstrated that HSS extracts different information from the data than do association analysis and the TDT 118. As Jorde 120 concluded, it is unlikely that a single method would provide optimal power under all circumstances to detect susceptibility genes for complex diseases. Moreover, most often the underlying genetic model is not known. Therefore, it is advisable to apply different methods and compare the results, or even to combine p-values after correction for the correlation between statistics. Haplotype sharing analysis has been applied in MS on a genomewide 121 and candidate region 109,122-124 basis.

2.3.5 Candidate genes

Positive results of association studies of candidate genes will only rarely be confirmed by a second investigation. This is caused by a number of factors, including too small sample sizes and publication bias 125. Ioannidis even argued that most published research findings are false 95. This is the consequence of bias and lack of study power, but also of selective publication among the many studies addressing the same question, and of the ratio of true relationships to no relationships among those tested in the field, i.e. the prior probability of the research finding being true. A large number of genes have been investigated for a possible role in susceptibility to MS (ref. www.ucsf.edu/msdb/r_ms_candidate_genes.html). The only association that has repeatedly been confirmed is the association with alleles of the human leukocyte antigen (HLA) system. In 1972, an association between alleles of the HLA system and MS was reported by Jersild et al. 126. They demonstrated association with HLA class I antigens A3 and B7. This association provided the first direct evidence for a genetic contribution to the susceptibility to MS. Later, the association with HLA class II antigens appeared to be stronger than that with class I antigens.
It has been confirmed by many investigators 127-129. In Northern European populations, the associated HLA type is DR2, DQw1 (DNA subtyping: DRB1*1501, DQA1*0102, DQB1*0602) 130-133. Differences in HLA associations have been reported in different populations: DR15 and DR4 in the Canary Islands 134, DR4 and DR15 in Turkey 135, DR4 and DR3 in Sardinia 136. The association may also be related to the type of MS (Western or Asian in Japan 137). It has been suggested that some haplotypes may be protective 114. Conserved haplotypes with recombination hotspots in between characterize the HLA region. This results in regions with strong linkage disequilibrium and reduced resolution to discriminate between genes on an ancestral haplotype. In most populations DRB1*1501 and DQB1*0602 are almost invariably linked, so it is still not clear whether DR or DQ or both are implicated in susceptibility to MS. In our haplotype sharing analysis, the interval showing the strongest haplotype sharing contained DQB1 124. The importance of DQB1 alleles was supported by studies in populations where different combinations of DR2 and DQ6 were present 138,139, whereas others, on the contrary, found that DRB1 was more important than DQB1 129,140. The contribution of the extended haplotype HLA DRB1*1501, DQA1*0102, DQB1*0602 to the risk in sibs was estimated at 14% by Ligers et al. 113. There is evidence for the involvement of different non-HLA genes in HLA-DR15 positive compared with HLA-DR15 negative patients 141-143. An important question is whether the association between HLA and MS is due to a causative role of HLA molecules in the susceptibility to MS, or whether a separate gene, not functionally involved in the HLA system, plays a role. The HLA system plays an important role in the immune system. Among other functions, it is involved in antigen presentation by macrophages to lymphocytes, after which the specific immunological reaction to this antigen can start. A preference of HLA genes to present specific autoantigens could indeed make them functionally involved in MS. Specific HLA types may provide a selective evolutionary advantage from the point of view of infectious disease, despite the fact that they make the individual more susceptible to an autoimmune disease. Another possibility is that a gene that is genetically close to the HLA genes could hitchhike with such a haplotype, even though by itself it might confer a selective evolutionary disadvantage.
In this way, genes containing a mutation that makes the individual susceptible to a disease could also reach higher population frequencies than expected on the basis of their selective disadvantage.

2.3.6 Whole genome screens

In 1996, three groups performed genome screens in multiplex families collected in Canada, the UK and the USA 144-146. They used (partly different) sets of polymorphic markers that spanned the whole genome. A multiple-stage screen was performed in all three samples; the areas identified in a first screen were analyzed in a secondary screen by the Canadian and UK groups. An important and perhaps surprising finding was that no regions of interest appeared strongly in all three initial screens. Overall, the contribution of all regions of interest in the initial screen appeared to be small

in the combined datasets. Around 90% of the genome was excluded for a locus with rare alleles that contributes a sibling recurrence risk λs > 3. However, two regions stood out in these studies. The HLA region on chromosome 6p21 appeared as the region with the strongest linkage in the USA study (peak MLS 3.57) and the UK study (peak MLS 2.8), though in the latter study it became apparent only in the secondary screen. In the Canadian study, the MLS was 0.65; evidence for linkage disequilibrium was found for one of three markers on chromosome 6, but not for the other two markers, which were closer to the HLA DR and DQ loci. The second region of interest was located on the short arm of chromosome 5. A genomewide two-stage scan was done in multiplex families from an isolated Finnish region with a high prevalence of MS 147. The population originates from a limited number of founders. This population structure was considered very useful in the study of complex disease because of the likely restricted number of founder mutations. The first stage comprised multipoint, non-parametric linkage analysis of a low-resolution screen of 328 polymorphic markers. This did not reveal statistically significant linkage. There were, however, 10 regions of slight interest (p=0.1-0.15), of which 8 were analyzed in the second stage. In this stage, a denser marker set was applied with an average spacing of 4.5 cM (0.1-10 cM), and 5 multiplex families were added. The results were analyzed with two-point parametric linkage analysis. The highest two-point LOD scores were found on chromosome 17 (Zmax=2.8, theta=0.04 under a dominant model). The UK genome screen 145 showed evidence for linkage on 17q22-q24 with the same markers. Allelic association with markers on 17q22-q24 was found in neither of the two studies. Since these first whole genome screens, several genome screens have been conducted with increasing insight into the requirements of genetic studies in complex diseases (Table 2).
Loci with LOD scores < 2 are not shown.

Conclusions on genome screens

So far, only the HLA region has repeatedly shown significant association and linkage with MS. This is probably because sample sizes and marker densities have still been insufficient given the complexity of the underlying genetic model. Genetic (locus and allelic) heterogeneity, reduced penetrance and the contribution of environmental factors complicate the search for susceptibility genes. It is to be expected that in the near future the prospects of finding these genes will be much better, with the availability of more SNP markers and high-throughput genotyping. However, funding will have to be sufficient as well, since large numbers of DNA samples will have to be examined. In order to obtain these large numbers of DNA samples, national and international cooperation will be required.

Table 2: Whole genome screens in Multiple Sclerosis

Study, population | Patients | Markers | Average separation | Statistical methods | Loci (MLS)
Sawcer 145, 1996, UK | Stage 1: 129 multiplex families, stage 2: 98 multiplex families, total 251 affected sibpairs | Stage 1: 311 microsatellite markers; stage 2: 44 markers | 12 cM | Linkage analysis (MAPMAKER/SIBS), maximum likelihood sharing probabilities for each sibpair, MLS | Stage 1: 19 loci with MLS >0.7 (nominal significance 5%); 6 loci with MLS >1.8; stage 2: 17q22 (2.7), 6p21 (2.8)
Haines 146, 1996, US | Stage 1: 52 multiplex families, stage 2: 23 multiplex families, total 126 affected sibpairs and 88 other affected relative pairs | 443 | 9.6 cM | Linkage (FASTLINK), sibpair (SAGE/SIBPAL), affected relative pair (SimIBD) | (significant on three tests) 7q21-22 (2.86)
Ebers 144, 1996, Canada | Stage 1: 61 multiplex families, 100 affected sibpairs, stage 2: 44+78 sibpairs | Stage 1: 257 microsatellite markers, stage 2: 5 microsatellite markers on 5p and 3 on 6p21 in all 3 datasets | 15.2 cM (stage 1) | Multipoint linkage analysis and multipoint sibpair analysis, TDT | Stage 1: 5 loci with MLS >1, of which D5S406 (4.24). Stage 2: 5p, 6p21 non-significant, D6S461 on TDT χ² 10.8 (p<0.01) and 10.9 (p<0.0005)
Kuokkanen 147, 1997, Finland | Stage 1: 16 multiplex families, stage 2: 21 multiplex families | Stage 1: 328 microsatellite markers | Stage 2: 4.5 cM | Stage 1: multipoint nonparametric linkage analysis, stage 2: two-point parametric linkage analysis | Stage 1: no significant linkage, stage 2: 17q22-24 (2.8)
Broadley 148, 2001, Italy | 40 multiplex families with 37 sibpairs and three other relative pairs | 321 microsatellite markers and HLA DRB1 | - | Multipoint nonparametric linkage analysis (GENEHUNTER-PLUS), TDT | No regions with genomewide significance
Coraddu 149, 2001, Sardinia | 49 multiplex families, 46 sibpairs and 3 sibtrios | 324 microsatellite markers, HLA DRB1, DQA1 and DQB1 | - | Nonparametric linkage analysis (MAPMAKER/SIBS) and TDT | No regions with genomewide significance
Dyment 150, 2001, Canada | 219 sibpairs | 105 markers previously showing increased sharing | - | Multipoint linkage analysis and transmission distortion (Aspex Statistical Package) | 5p14 (2.27), transmission disequilibrium D17S789 (p=0.0015)
The Transatlantic Multiple Sclerosis Genetics Cooperative 151, 2001, US, UK, Canada | US: 52 multiplex families, 133 affecteds, UK: 128 families, 264 affecteds, Canada: 61 families, 139 affecteds | US: 442, UK: 314, Canada: 257 microsatellite markers with different overlap between screens | - | Meta-analysis, non-parametric multipoint linkage (GENEHUNTER) | No regions with genomewide significance
Akesson 152, 2002, Scandinavia | 136 sib pairs | 399 microsatellite markers | 9.7 cM | Multipoint nonparametric linkage analysis (MAPMAKER/SIBS) | No regions with genomewide significance
Ban 153, 2002, Australia | 54 sib pairs | 397 microsatellite markers | - | Multipoint nonparametric linkage analysis (MAPMAKER/SIBS) | No regions with genomewide significance
Haines 143, 2002, US | 1: 52 multiplex families with 135 affecteds, 2: 46 multiplex families with 131 affecteds | 80 microsatellite markers in 19 previously determined regions | - | Parametric and nonparametric linkage analysis (FASTLINK, HOMOG, ASPEX, ALLEGRO), stratification for HLA DR2 | 5 regions with LOD score >2.0; after stratification for DR2 in 2 additional regions
He 121, 2002, Sweden | Stage 1: 5 MS patients from 4 families, stage 2: 15 MS patients and healthy relatives (genetically isolated population) | Stage 1: 390 (380) microsatellite markers | 10 cM | TDT on haplotypes | Stage 1: 7 fragments shared by >=4/8 patient chromosomes; stage 2: conserved haplotype of 10 cM on 17p11 in 12/15 patients, 4 marker haplotype 6/15, p<0.01
Sawcer 154, 2002, UK | Pooled DNA of 1: 216 patients vs 219 controls and 2: 745 patients vs 1490 parents | 6000 microsatellite markers | 0.5 cM with gaps to >10 cM | Association analysis, Chi-square test | 10 most promising markers p-value <5%
Coraddu 155, 2002, Sardinia | Pooled DNA of 1: 229 MS patients vs 264 controls and 2: 235 trios of patient vs parents; all patients had RR MS | 2764 microsatellite markers | - | Association analysis, Chi-square test | 5 markers with p-value <5% in both screens
Goedde 156, 2002, Germany | Pooled DNA of 198 HLA-DR15+ MS patients and 198 controls | 6000 (4666) microsatellite markers | 0.75 Mb | Association analysis, Chi-square test | 87 markers with nominal p-value <0.05 (not corrected for multiple testing)
Ban 157, 2003, Australia | Pooled DNA of 217 HLA DR15 positive MS patients and 187 controls | 6000 (4346) microsatellite markers | 0.75 Mb | Association analysis, Chi-square test | 7 markers with p-value =< 1%
Bielecki 158, 2003, Poland | Pooled DNA of 200 MS patients vs 200 controls and 129 trios | 6000 (4696) microsatellite markers | 0.75 Mb | Association analysis, Chi-square test | 5 markers with p-value <5%
Eraksoy 159, 2003, Turkey | 43 multiplex families: 16 with >=2 sibs, 27 other relative pairs (consanguinity in 13 families). Stage 1: 92 affecteds, stage 2: 78 unaffecteds (45 markers) | 392 microsatellite markers | 10 cM | Linkage analysis (GENEHUNTER-PLUS) | No regions with genomewide significance
Eraksoy 160, 2003, Turkey | Pooled DNA of 197 patients with RR/SP MS vs 199 controls | 6000 (4359) microsatellite markers | - | Association analysis, Chi-square test | 12 regions with empirical p-value <5%; 5p15 confirmed by linkage

Study, population Patients Markers Average separation Statistical methods Loci (MLS) Giedraitis 161 2003 Sweden 54 MS patients 1040 (834) and 114 healthy microsatellite family members markers (genetically isolated population) 4.4 cm Marker- and haplotype-based transmission dis- 7 regions in (TRANSMIT) TDT, equilibrium with nonparametric MS, 1 strongly linkage analysis (14q24-32); (Genehunter) lodscores around 2 for several regions; both on 17q12-24 Goris 162 2003 Belgium Pooled DNA of 204 MS patients vs 198 controls and 131 trios 6000 (4875) microsatellite markers Association analysis, Chi-square test 27 markers p-value<5% Heggarty 163 2003 Northern Ireland Pooled DNA of 200 MS patients vs 200 controls 6000 (2537) microsatellite markers Association analysis, Chi-square test 22 markers p-value< 5%, 5 markers p-value <1% (of which 2 in HLA region) Hensiek 164 2003 UK Previous 129 multiplex families + 97 multiplex families, analyzed together and stratified on HLA- DRB1*1501 242 microsatellite markers (+111 previous) 12 cm Multipoint nonparametric linkage analysis (MAPMAKER/ SIBS) No regions with genomewide significance Laaksonen 165 2003 Finland Pooled DNA of 195 patients vs 205 controls 5532 (5522) microsatellite markers 0.75 Mb Association analysis, Chi-square test 108 markers with hypothetical p- value<5%, 5 genomic regions with markers <1 cm apart 40 Genetic epidemiology of MS

Study, population Patients Markers Average separation Statistical methods Loci (MLS) Liguori 165 2003 Italy Pooled DNA of 224 MS patients vs 231 controls and 185 trios 6000 (4789) microsatellite markers 0.75 Mb Association analysis, Chi-square test D2S367 (p-value 0.015) Santos 166 2003 Portugal Pooled DNA 188 MS patients vs 188 controls 6000 (4661) microsatellite markers 0.75 Mb Association analysis, Chi-square test 10 markers empirical p-value <0.01 Weber 167 2003 Germany Pooled DNA of 234 MS patients vs 209 controls and 68 trios 6000 microsatellite markers 0.75 Mb Association analysis, Chi-square test 11 markers with empirical p-value <5% GAMES and Transatlantic Multiple Sclerosis Genetics Cooperative 76 2003 >700 multiplex families Around 6000 microsatellite markers 0.75 Mb Meta-analysis of raw genotypes, linkage analysis Genomewide significance in the MHC region, not outside it. Limited accuracy of data Kenealy 168 2004 US and France 456 affected relative pairs from 245 families 390 microsatellite markers <10cM Parametric and non-parametric analysis of linkage (FASTLINK and HOMOG), score pairs and exponential model (Allegro) No regions with genomewide significance Dyment 169 2004 Canadian 522 sibpairs from 442 families (stage 1: 219, stage 2: 333 sibpairs) Stage 1: 498 and stage 2: 35 microsatellite markers 7 cm Multipoint linkage analysis and transmission distortion (Aspex Statistical Package) Only linkage on 6p (4.40) Suggestive 2q27 (2.27) and 5p15 (2.09) 41 Chapter 2

Study, population Patients Markers Average separation Statistical methods Loci (MLS) Goedde 170 2005 Germany Pop1 and pop2: 100 MS patients (85% HLA DR2+) vs 100 unstratified controls; pop3 180 MS patients vs 180 controls, both unstratified 11.555 SNPs on DNA chips 105 kb (mean genetic distance 0.32 cm, gaps on chr. 16,17 and 19) Association analysis, exact Fisher test for two-by-three tables; Bonferroni correction for multiple testing SNPs in the vicinity of HLA-DRA show strong association (p-value 4x10-10 (extended haplotype?); no other significant association after Bonferroni correction International Multiple Sclerosis Genetics Consortium 75 2005 Northern European 2692 individuals from 730 multiplex families: 1595 affecteds of whom 830 sib pairs and 172 other relative pairs 5858 (4506) SNPs <500 kb Multipoint nonparametric linkage analysis 6p21 (11.66), 17q23 (2.45), 5q33 (2.18) Reich 91 2005 African Americans 605 patients and 1043 controls 1555 Association analysis/ Admixture mapping Centromere 1 (5.2) 42 Genetic epidemiology of MS

References

1. Pittock SJ, Mayr WT, McClelland RL, Jorgensen NW, Weigand SD, Noseworthy JH et al. Change in MS-related disability in a population-based cohort: a 10-year follow-up study. Neurology 2004; 62(1):51-59.
2. Ebers GC. Prognostic factors for multiple sclerosis: the importance of natural history studies. J Neurol 2005; 252 Suppl 3:iii15-iii20.
3. Hafler DA, Slavik JM, Anderson DE, O'Connor KC, De Jager P, Baecher-Allan C. Multiple sclerosis. Immunol Rev 2005; 204:208-231.
4. Chaudhuri A, Behan PO. Multiple sclerosis is not an autoimmune disease. Arch Neurol 2004; 61(10):1610-1612.
5. Kurtzke JF. MS epidemiology world wide. One view of current status. Acta Neurol Scand Suppl 1995; 161:23-33.
6. Poser CM. The epidemiology of multiple sclerosis: a general overview. Ann Neurol 1994; 36 Suppl 2:S180-S193.
7. Kurtzke JF, Dean G, Botha DP. A method for estimating the age at immigration of white immigrants to South Africa, with an example of its importance. S Afr Med J 1970; 44(23):663-669.
8. Kurtzke JF. Epidemiologic evidence for multiple sclerosis as an infection [published erratum appears in Clin Microbiol Rev 1994 Jan;7(1):141]. Clin Microbiol Rev 1993; 6(4):382-427.
9. Kurtzke JF, Delasnerie-Laupretre N, Wallin MT. Multiple sclerosis in North African migrants to France. Acta Neurol Scand 1998; 98(5):302-309.
10. Dean G, Kurtzke JF. On the risk of multiple sclerosis according to age at immigration to South Africa. Br Med J 1971; 3(777):725-729.
11. Elian M, Nightingale S, Dean G. Multiple sclerosis among United Kingdom-born children of immigrants from the Indian subcontinent, Africa and the West Indies. J Neurol Neurosurg Psychiatry 1990; 53(10):906-911.
12. Kurtzke JF, Hyllested K. Multiple sclerosis in the Faroe Islands: I. Clinical and epidemiological features. Ann Neurol 1979; 5(1):6-21.
13. Kurtzke JF, Hyllested K, Heltberg A. Multiple sclerosis in the Faroe Islands: transmission across four epidemics. Acta Neurol Scand 1995; 91(5):321-325.
14. Poser CM, Hibberd PL. Analysis of the 'epidemic' of multiple sclerosis in the Faroe Islands. II. Biostatistical aspects. Neuroepidemiology 1988; 7(4):181-189.
15. Poser CM, Hibberd PL, Benedikz J, Gudmundsson G. Analysis of the 'epidemic' of multiple sclerosis in the Faroe Islands. I. Clinical and epidemiological aspects. Neuroepidemiology 1988; 7(4):168-180.
16. Alotaibi S, Kennedy J, Tellier R, Stephens D, Banwell B. Epstein-Barr virus in pediatric multiple sclerosis. JAMA 2004; 291(15):1875-1879.

17. Croxford JL, Olson JK, Anger HA, Miller SD. Initiation and exacerbation of autoimmune demyelination of the central nervous system via virus-induced molecular mimicry: implications for the pathogenesis of multiple sclerosis. J Virol 2005; 79(13):8581-8590.
18. Marrie RA. Environmental risk factors in multiple sclerosis aetiology. Lancet Neurol 2004; 3(12):709-718.
19. Eichhorst H. Uber infantile und heriditare multiple sclerose. Virchow's Arch Path Anat 1896; 146:173-192.
20. Sadovnick AD, Dircks A, Ebers GC. Genetic counselling in multiple sclerosis: risks to sibs and children of affected individuals. Clin Genet 1999; 56(2):118-122.
21. Grasso MG, Frontali M, Bernardi S, Pantano P, Fieschi C. Multifactorial inheritance and recurrence risks of multiple sclerosis in Italian patients. Neuroepidemiology 1989; 8(6):300-307.
22. Robertson NP, Fraser M, Deans J, Clayton D, Walker N, Compston DA. Age-adjusted recurrence risks for relatives of patients with multiple sclerosis. Brain 1996; 119(Pt 2):449-455.
23. Carton H, Vlietinck R, Debruyne J, De Keyser J, D'Hooghe MB, Loos R et al. Risks of multiple sclerosis in relatives of patients in Flanders, Belgium. J Neurol Neurosurg Psychiatry 1997; 62(4):329-333.
24. Sadovnick AD, Macleod PM. The familial nature of multiple sclerosis: empiric recurrence risks for first, second-, and third-degree relatives of patients. Neurology 1981; 31(8):1039-1041.
25. Sadovnick AD, Baird PA. The familial nature of multiple sclerosis: age-corrected empiric recurrence risks for children and siblings of patients. Neurology 1988; 38(6):990-991.
26. Sadovnick AD, Baird PA, Ward RH. Multiple sclerosis: updated risks for relatives. Am J Med Genet 1988; 29(3):533-541.
27. Sadovnick AD. Familial recurrence risks and inheritance of multiple sclerosis. Curr Opin Neurol Neurosurg 1993; 6:189-194.
28. Sadovnick AD, Ebers GC, Dyment DA, Risch NJ. Evidence for genetic basis of multiple sclerosis. The Canadian Collaborative Study Group. Lancet 1996; 347(9017):1728-1730.
29. Prokopenko I, Montomoli C, Ferrai R, Musu L, Piras ML, Ticca A et al. Risk for relatives of patients with multiple sclerosis in central Sardinia, Italy. Neuroepidemiology 2003; 22(5):290-296.
30. Ebers GC, Sadovnick AD, Risch NJ. A genetic basis for familial aggregation in multiple sclerosis. Canadian Collaborative Study Group. Nature 1995; 377(6545):150-151.
31. Sadovnick AD, Armstrong H, Rice GP, Bulman D, Hashimoto L, Paty DW et al. A population-based study of multiple sclerosis in twins: update. Ann Neurol 1993; 33(3):281-285.

32. Bobowick AR, Kurtzke JF, Brody JA, Hrubec Z, Gillespie M. Twin study of multiple sclerosis: an epidemiologic inquiry. Neurology 1978; 28(10):978-987.
33. Ebers GC, Bulman DE, Sadovnick AD, Paty DW, Warren S, Hader W et al. A population-based study of multiple sclerosis in twins. N Engl J Med 1986; 315(26):1638-1642.
34. Heltberg A. Twin studies in multiple sclerosis. Ital J Neurol Sci 1987; Suppl 6:35-39.
35. Mumford CJ, Wood NW, Kellar-Wood H, Thorpe JW, Miller DH, Compston DA. The British Isles survey of multiple sclerosis in twins. Neurology 1994; 44(1):11-15.
36. Williams A, Eldridge R, McFarland H, Houff S, Krebs H, McFarlin D. Multiple sclerosis in twins. Neurology 1980; 30(11):1139-1147.
37. Willer CJ, Dyment DA, Risch NJ, Sadovnick AD, Ebers GC. Twin concordance and sibling recurrence rates in multiple sclerosis. Proc Natl Acad Sci U S A 2003; 100(22):12877-12882.
38. Thorpe JW, Mumford CJ, Compston DA, Kendall BE, MacManus DG, McDonald WI et al. British Isles survey of multiple sclerosis in twins: MRI. J Neurol Neurosurg Psychiatry 1994; 57(4):491-496.
39. Machin GA. Some causes of genotypic and phenotypic discordance in monozygotic twin pairs. Am J Med Genet 1996; 61(3):216-228.
40. Fraga MF, Ballestar E, Paz MF, Ropero S, Setien F, Ballestar ML et al. From The Cover: Epigenetic differences arise during the lifetime of monozygotic twins. Proc Natl Acad Sci U S A 2005; 102(30):10604-10609.
41. Vyshkina T, Banisor I, Shugart YY, Leist TP, Kalman B. Genetic variants of Complex I in multiple sclerosis. J Neurol Sci 2005; 228(1):55-64.
42. Kalman B, Leist TP. A mitochondrial component of neurodegeneration in multiple sclerosis. Neuromolecular Med 2003; 3(3):147-158.
43. Ebers GC, Sadovnick AD, Dyment DA, Yee IM, Willer CJ, Risch N. Parent-of-origin effect in multiple sclerosis: observations in half-siblings. Lancet 2004; 363(9423):1773-1774.
44. Robertson NP, O'Riordan JI, Chataway J, Kingsley DP, Miller DH, Clayton D et al. Offspring recurrence rates and clinical characteristics of conjugal multiple sclerosis. Lancet 1997; 349(9065):1587-1590.
45. Ebers GC, Yee IM, Sadovnick AD, Duquette P. Conjugal multiple sclerosis: population-based prevalence and recurrence risks in offspring. Canadian Collaborative Study Group. Ann Neurol 2000; 48(6):927-931.
46. Callander M, Landtblom AM. A cluster of multiple sclerosis cases in Lysvik in the Swedish county of Varmland. Acta Neurol Scand 2004; 110(1):14-22.
47. Koch MJ, Reed D, Stern R, Brody JA. Multiple sclerosis. A cluster in a small Northwestern United States community. JAMA 1974; 228(12):1555-1557.
48. Sadovnick AD, Yee IM, Ebers GC. Recurrence risks to sibs of MS index cases: impact of consanguineous matings. Neurology 2001; 56(6):784-785.

49. Lynch SG, Rose JW, Smoker W, Petajan JH. MRI in familial multiple sclerosis. Neurology 1990; 40(6):900-903.
50. Tienari PJ, Salonen O, Wikstrom J, Valanne L, Palo J. Familial multiple sclerosis: MRI findings in clinically affected and unaffected siblings. J Neurol Neurosurg Psychiatry 1992; 55(10):883-886.
51. Constantinescu CS, Grossman RI, Finelli PF, Kamoun M, Zmijewski C, Cohen JA. Clinical and subclinical neurological involvement in children of conjugal multiple sclerosis patients. Mult Scler 1995; 1(3):170-172.
52. Fulton JC, Grossman RI, Mannon LJ, Udupa J, Kolson DL. Familial multiple sclerosis: volumetric assessment in clinically symptomatic and asymptomatic individuals. Mult Scler 1999; 5(2):74-77.
53. Duquette P, Charest L. Cerebrospinal fluid findings in healthy siblings of multiple sclerosis patients. Neurology 1986; 36(5):727-729.
54. Xu XH, McFarlin DE. Oligoclonal bands in CSF: twins with MS. Neurology 1984; 34(6):769-774.
55. Haghighi S, Andersen O, Rosengren L, Bergstrom T, Wahlstrom J, Nilsson S. Incidence of CSF abnormalities in siblings of multiple sclerosis patients and unrelated controls. J Neurol 2000; 247(8):616-622.
56. Nuwer MR, Visscher BR, Packwood JW, Namerow NS. Evoked potential testing in relatives of multiple sclerosis patients. Ann Neurol 1985; 18(1):30-34.
57. Gilbert JJ, Sadler M. Unsuspected multiple sclerosis. Arch Neurol 1983; 40(9):533-536.
58. Heinsen H, Lockemann U, Puschel K. Unsuspected (clinically silent) multiple sclerosis. Quantitative investigations in one autoptic case. Int J Legal Med 1995; 107(5):263-266.
59. Kurtzke JF, Beebe GW, Norman JE, Jr. Epidemiology of multiple sclerosis in U.S. veterans: 1. Race, sex, and geographic distribution. Neurology 1979; 29(9 Pt 1):1228-1235.
60. Chakravarti A. Population genetics--making sense out of sequence. Nat Genet 1999; 21(1 Suppl):56-60.
61. Boon M, Nolte IM, De Keyser J, Buys CH, te Meerman GJ. Inheritance mode of multiple sclerosis: the effect of HLA class II alleles is stronger than additive. Hum Genet 2004; 115(4):280-284.
62. Lindsey JW. Familial recurrence rates and genetic models of multiple sclerosis. Am J Med Genet A 2005; 135(1):53-58.
63. Kruglyak L, Lander ES. Complete multipoint sib-pair analysis of qualitative and quantitative traits. Am J Hum Genet 1995; 57(2):439-454.
64. Kruglyak L, Daly MJ, Reeve-Daly MP, Lander ES. Parametric and nonparametric linkage analysis: a unified multipoint approach. Am J Hum Genet 1996; 58(6):1347-1363.

65. Nolte IM, te Meerman GJ. Comparison of multilocus association methods for fine-mapping of disease gene loci involved in qualitative traits. In: Statistics and population genetics of haplotype sharing as a tool for fine-mapping of disease gene loci (thesis). University Medical Center Groningen, University of Groningen, The Netherlands, 2003.
66. Wallenstein S, Hodge SE, Weston A. Logistic regression model for analyzing extended haplotype data. Genet Epidemiol 1998; 15(2):173-181.
67. Haines JL, Terwedow HA, Burgess K, Pericak-Vance MA, Rimmler JB, Martin ER et al. Linkage of the MHC to familial multiple sclerosis suggests genetic heterogeneity. The Multiple Sclerosis Genetics Group. Hum Mol Genet 1998; 7(8):1229-1234.
68. Dyment DA, Ebers GC, Sadovnick AD. Genetics of multiple sclerosis. Lancet Neurol 2004; 3(2):104-110.
69. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG et al. The sequence of the human genome. Science 2001; 291(5507):1304-1351.
70. Boehnke M. Limits of resolution of genetic linkage studies: implications for the positional cloning of human disease genes. Am J Hum Genet 1994; 55(2):379-390.
71. Lander ES. The new genomics: global views of biology. Science 1996; 274(5287):536-539.
72. Altmuller J, Palmer LJ, Fischer G, Scherb H, Wjst M. Genomewide scans of complex human diseases: true linkage is hard to find. Am J Hum Genet 2001; 69(5):936-950.
73. Kruglyak L, Lander ES. High-resolution genetic mapping of complex traits. Am J Hum Genet 1995; 56(5):1212-1223.
74. Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science 1996; 273(5281):1516-1517.
75. Sawcer S, Ban M, Maranian M, Yeo TW, Compston A, Kirby A et al. A high-density screen for linkage in multiple sclerosis. Am J Hum Genet 2005; 77(3):454-467.
76. A meta-analysis of whole genome linkage screens in multiple sclerosis. J Neuroimmunol 2003; 143(1-2):39-46.
77. Gambaro G, Anglani F, D'Angelo A. Association studies of genetic polymorphisms and complex disease. Lancet 2000; 355(9200):308-311.
78. Hunter DJ. Gene-environment interactions in human diseases. Nat Rev Genet 2005; 6(4):287-298.
79. Falk CT, Rubinstein P. Haplotype relative risks: an easy reliable way to construct a proper control sample for risk calculations. Ann Hum Genet 1987; 51(Pt 3):227-233.
80. Thomson G. Mapping disease genes: family-based association studies. Am J Hum Genet 1995; 57(2):487-498.
81. Barcellos LF, Klitz W, Field LL, Tobias R, Bowcock AM, Wilson R et al. Association mapping of disease loci, by use of a pooled DNA genomic screen. Am J Hum Genet 1997; 61(3):734-747.

82. Godde R, Nigmatova V, Jagiello P, Sindern E, Haupts M, Schimrigk S et al. Refining the results of a whole-genome screen based on 4666 microsatellite markers for defining predisposition factors for multiple sclerosis. Electrophoresis 2004; 25(14):2212-2218.
83. Yeo TW, Roxburgh R, Maranian M, Singlehurst S, Gray J, Hensiek A et al. Refining the analysis of a whole genome linkage disequilibrium association map: the United Kingdom results. J Neuroimmunol 2003; 143(1-2):53-59.
84. Lander ES, Botstein D. Mapping complex genetic traits in humans: new methods using a complete RFLP linkage map. Cold Spring Harb Symp Quant Biol 1986; 51 Pt 1:49-62.
85. Lander ES, Botstein D. Homozygosity mapping: a way to map human recessive traits with the DNA of inbred children. Science 1987; 236(4808):1567-1570.
86. Kruglyak L. Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet 1999; 22(2):139-144.
87. Carlson CS, Eberle MA, Kruglyak L, Nickerson DA. Mapping complex disease loci in whole-genome association studies. Nature 2004; 429(6990):446-452.
88. Botstein D, Risch N. Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nat Genet 2003; 33 Suppl:228-237.
89. Wall JD, Pritchard JK. Haplotype blocks and linkage disequilibrium in the human genome. Nat Rev Genet 2003; 4(8):587-597.
90. Pritchard JK, Przeworski M. Linkage disequilibrium in humans: models and data. Am J Hum Genet 2001; 69(1):1-14.
91. Reich D, Patterson N, Jager PL, McDonald GJ, Waliszewska A, Tandon A et al. A whole-genome admixture scan finds a candidate locus for multiple sclerosis susceptibility. Nat Genet 2005; 37(10):1113-1118.
92. Perneger TV. What's wrong with Bonferroni adjustments. BMJ 1998; 316(7139):1236-1238.
93. Bender R, Lange S. Multiple test procedures other than Bonferroni's deserve wider use. BMJ 1999; 318(7183):600-601.
94. Bohringer S, Hardt C, Miterski B, Steland A, Epplen JT. Multilocus statistics to uncover epistasis and heterogeneity in complex diseases: revisiting a set of multiple sclerosis data. Eur J Hum Genet 2003; 11(8):573-584.
95. Ioannidis JP. Why most published research findings are false. PLoS Med 2005; 2(8):e124.
96. Suarez BK. The affected sib pair IBD distribution for HLA-linked disease susceptibility genes. Tissue Antigens 1978; 12(2):87-93.
97. Tierney C, McKnight B. Power of affected sibling method tests for linkage. Hum Hered 1993; 43(5):276-287.

98. Weeks DE, Lange K. The affected-pedigree-member method of linkage analysis. Am J Hum Genet 1988; 42(2):315-326.
99. Weeks DE, Lange K. A multilocus extension of the affected-pedigree-member method of linkage analysis. Am J Hum Genet 1992; 50(4):859-868.
100. Risch N. Linkage strategies for genetically complex traits. II. The power of affected relative pairs. Am J Hum Genet 1990; 46(2):229-241.
101. Payami H, Louis EJ, Klitz W, Lo SK, Thomson G. Family and population analysis of multiple sclerosis. Genet Epidemiol Suppl 1986; 1:381-386.
102. Risch N. Genetic analysis workshop IV: summary of the multiple sclerosis workshop. Genet Epidemiol Suppl 1986; 1:371-380.
103. Stewart GJ, McLeod JG, Basten A, Bashir HV. HLA family studies and multiple sclerosis: A common gene, dominantly expressed. Hum Immunol 1981; 3(1):13-29.
104. Clerget-Darpoux F, Govaerts A, Feingold N. HLA and susceptibility to multiple sclerosis. Tissue Antigens 1984; 24(3):160-169.
105. Barcellos LF, Schito AM, Rimmler JB, Vittinghoff E, Shih A, Lincoln R et al. CC-chemokine receptor 5 polymorphism and age of onset in familial multiple sclerosis. Multiple Sclerosis Genetics Group. Immunogenetics 2000; 51(4-5):281-288.
106. Spielman RS, McGinnis RE, Ewens WJ. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet 1993; 52(3):506-516.
107. Ott J. Statistical properties of the haplotype relative risk. Genet Epidemiol 1989; 6(1):127-130.
108. Vandenbroeck K, Fiten P, Heggarty S, Goris A, Cocco E, Hawkins SA et al. Chromosome 7q21-22 and multiple sclerosis: evidence for a genetic susceptibility effect in vicinity to the protachykinin-1 gene. J Neuroimmunol 2002; 125(1-2):141-148.
109. Dyment DA, Steckley JL, Willer CJ, Armstrong H, Sadovnick AD, Risch N et al. No evidence to support CTLA-4 as a susceptibility gene in MS families: the Canadian Collaborative Study. J Neuroimmunol 2002; 123(1-2):193-198.
110. Marrosu MG, Schirru L, Fadda E, Mancosu C, Lai M, Cocco E et al. ICAM-1 gene is not associated with multiple sclerosis in sardinian patients. J Neurol 2000; 247(9):677-680.
111. Feakes R, Sawcer S, Broadley S, Coraddu F, Roxburgh R, Gray J et al. Interleukin 1 receptor antagonist (IL-1ra) in multiple sclerosis. J Neuroimmunol 2000; 105(1):96-101.
112. Coppin H, Ribouchon MT, Bausero P, Pessac B, Fontaine B, Semana G et al. No evidence for transmission disequilibrium between a new marker at the myelin basic protein locus and multiple sclerosis in French patients. Genes Immun 2000; 1(8):478-482.

113. Ligers A, Dyment DA, Willer CJ, Sadovnick AD, Ebers G, Risch N et al. Evidence of linkage with HLA-DR in DRB1*15-negative families with multiple sclerosis. Am J Hum Genet 2001; 69(4):900-903.
114. Dyment DA, Herrera BM, Cader MZ, Willer CJ, Lincoln MR, Sadovnick AD et al. Complex interactions among MHC haplotypes in multiple sclerosis: susceptibility and resistance. Hum Mol Genet 2005; 14(14):2019-2026.
115. te Meerman GJ, van der Meulen MA, Sandkuijl LA. Perspectives of identity by descent (IBD) mapping in founder populations. Clin Exp Allergy 1995; 25 Suppl 2:97-102.
116. te Meerman GJ, van der Meulen MA. Genomic sharing surrounding alleles identical by descent: effects of genetic drift and population growth. Genet Epidemiol 1997; 14(6):1125-1130.
117. van der Meulen MA, te Meerman GJ. Haplotype sharing analysis in affected individuals from nuclear families with at least one affected offspring. Genet Epidemiol 1997; 14(6):915-920.
118. Nolte IM, Spijker GT, Boon M, Jansen RC, Postma DS, Buys CHCM et al. The Haplotype Sharing Statistic: finemapping of disease gene loci by comparing patients and controls for the length of haplotype sharing. In: Statistics and population genetics of haplotype sharing as a tool for fine-mapping of disease gene loci (thesis). University Medical Center Groningen, University of Groningen, 2003.
119. Nolte IM, te Meerman GJ. The probability that similar haplotypes are identical by descent. Ann Hum Genet 2002; 66(Pt 3):195-209.
120. Jorde LB. Linkage disequilibrium and the search for complex disease genes. Genome Res 2000; 10(10):1435-1444.
121. He B, Giedraitis V, Ligers A, Binzer M, Andersen PM, Forsgren L et al. Sharing of a conserved haplotype suggests a susceptibility gene for multiple sclerosis at chromosome 17p11. Eur J Hum Genet 2002; 10(4):271-275.
122. Hashimoto LL, Mak TW, Ebers GC. T cell receptor alpha chain polymorphisms in multiple sclerosis. J Neuroimmunol 1992; 40(1):41-48.
123. Hashimoto LL, Walter MA, Cox DW, Ebers GC. Immunoglobulin heavy chain variable region polymorphisms and multiple sclerosis susceptibility. J Neuroimmunol 1993; 44(1):77-83.
124. Boon M, Nolte IM, Bruinenberg M, Spijker GT, Terpstra P, Raelson J et al. Mapping of a susceptibility gene for multiple sclerosis to the 51 kb interval between G511525 and D6S1666 using a new method of haplotype sharing analysis. Neurogenetics 2001; 3(4):221-230.
125. Colhoun HM, McKeigue PM, Davey SG. Problems of reporting genetic associations with complex outcomes. Lancet 2003; 361(9360):865-872.
126. Jersild C, Svejgaard A, Fog T. HL-A antigens and multiple sclerosis. Lancet 1972; 1(7762):1240-1241.

127. Olerup O, Hillert J. HLA class II-associated genetic susceptibility in multiple sclerosis: a critical evaluation. Tissue Antigens 1991; 38(1):1-15.
128. Hillert J, Gronning M, Nyland H, Link H, Olerup O. An immunogenetic heterogeneity in multiple sclerosis. J Neurol Neurosurg Psychiatry 1992; 55(10):887-890.
129. Lincoln MR, Montpetit A, Cader MZ, Saarela J, Dyment DA, Tiislar M et al. A predominant role for the HLA class II region in the association of the MHC region with multiple sclerosis. Nat Genet 2005; 37(10):1108-1112.
130. Allen M, Sandberg-Wollheim M, Sjogren K, Erlich HA, Petterson U, Gyllensten U. Association of susceptibility to multiple sclerosis in Sweden with HLA class II DRB1 and DQB1 alleles. Hum Immunol 1994; 39(1):41-48.
131. Haegert DG, Francis GS. HLA-DQ polymorphisms do not explain HLA class II associations with multiple sclerosis in two Canadian patient groups. Neurology 1993; 43(6):1207-1210.
132. Haegert DG, Muntoni F, Murru MR, Costa G, Francis GS, Marrosu MG. HLA-DQA1 and -DQB1 associations with multiple sclerosis in Sardinia and French Canada: evidence for immunogenetically distinct patient groups [see comments]. Neurology 1993; 43(3 Pt 1):548-552.
133. Hauser SL, Fleischnick E, Weiner HL, Marcus D, Awdeh Z, Yunis EJ et al. Extended major histocompatibility complex haplotypes in patients with multiple sclerosis. Neurology 1989; 39(2 Pt 1):275-277.
134. Coraddu F, Reyes-Yanez MP, Parra A, Gray J, Smith SI, Taylor CJ et al. HLA associations with multiple sclerosis in the Canary Islands. J Neuroimmunol 1998; 87(1-2):130-135.
135. Saruhan-Direskeneli G, Esin S, Baykan-Kurt B, Ornek I, Vaughan R, Eraksoy M. HLA-DR and -DQ associations with multiple sclerosis in Turkey. Hum Immunol 1997; 55(1):59-65.
136. Marrosu MG, Muntoni F, Murru MR, Spinicci G, Pischedda MP, Goddi F et al. Sardinian multiple sclerosis is associated with HLA-DR4: a serologic and molecular analysis. Neurology 1988; 38(11):1749-1753.
137. Kira J, Kanai T, Nishimura Y, Yamasaki K, Matsushita S, Kawano Y et al. Western versus Asian types of multiple sclerosis: immunogenetically and clinically distinct disorders. Ann Neurol 1996; 40(4):569-574.
138. Kalman B, Takacs K, Gyodi E, Kramer J, Fust G, Tauszik T et al. Sclerosis multiplex in gypsies. Acta Neurol Scand 1991; 84(3):181-185.
139. Caballero A, Alves-Leon S, Papais-Alvarenga R, Fernandez O, Navarro G, Alonso A. DQB1*0602 confers genetic susceptibility to multiple sclerosis in Afro-Brazilians. Tissue Antigens 1999; 54(5):524-526.
140. Oksenberg JR, Barcellos LF, Cree BA, Baranzini SE, Bugawan TL, Khan O et al. Mapping multiple sclerosis susceptibility to the HLA-DR locus in African Americans. Am J Hum Genet 2004; 74(1):160-167.

141. Barcellos LF, Oksenberg JR, Green AJ, Bucher P, Rimmler JB, Schmidt S et al. Genetic basis for clinical expression in multiple sclerosis. Brain 2002; 125(Pt 1):150-158.
142. Chataway J, Feakes R, Coraddu F, Gray J, Deans J, Fraser M et al. The genetics of multiple sclerosis: principles, background and updated results of the United Kingdom systematic genome screen. Brain 1998; 121(Pt 10):1869-1887.
143. Haines JL, Bradford Y, Garcia ME, Reed AD, Neumeister E, Pericak-Vance MA et al. Multiple susceptibility loci for multiple sclerosis. Hum Mol Genet 2002; 11(19):2251-2256.
144. Ebers GC, Kukay K, Bulman DE, Sadovnick AD, Rice G, Anderson C et al. A full genome search in multiple sclerosis. Nat Genet 1996; 13(4):472-476.
145. Sawcer S, Jones HB, Feakes R, Gray J, Smaldon N, Chataway J et al. A genome screen in multiple sclerosis reveals susceptibility loci on chromosome 6p21 and 17q22. Nat Genet 1996; 13(4):464-468.
146. Haines JL, Ter Minassian M, Bazyk A, Gusella JF, Kim DJ, Terwedow H et al. A complete genomic screen for multiple sclerosis underscores a role for the major histocompatability complex. The Multiple Sclerosis Genetics Group [see comments]. Nat Genet 1996; 13(4):469-471.
147. Kuokkanen S, Gschwend M, Rioux JD, Daly MJ, Terwilliger JD, Tienari PJ et al. Genomewide scan of multiple sclerosis in Finnish multiplex families. Am J Hum Genet 1997; 61(6):1379-1387.
148. Broadley S, Sawcer S, D'Alfonso S, Hensiek A, Coraddu F, Gray J et al. A genome screen for multiple sclerosis in Italian families. Genes Immun 2001; 2(4):205-210.
149. Coraddu F, Sawcer S, D'Alfonso S, Lai M, Hensiek A, Solla E et al. A genome screen for multiple sclerosis in Sardinian multiplex families. Eur J Hum Genet 2001; 9(8):621-626.
150. Dyment DA, Willer CJ, Scott B, Armstrong H, Ligers A, Hillert J et al. Genetic susceptibility to MS: a second stage analysis in Canadian MS families. Neurogenetics 2001; 3(3):145-151.
151. A meta-analysis of genomic screens in multiple sclerosis. The Transatlantic Multiple Sclerosis Genetics Cooperative. Mult Scler 2001; 7(1):3-11.
152. Akesson E, Oturai A, Berg J, Fredrikson S, Andersen O, Harbo HF et al. A genomewide screen for linkage in Nordic sib-pairs with multiple sclerosis. Genes Immun 2002; 3(5):279-285.
153. Ban M, Stewart GJ, Bennetts BH, Heard R, Simmons R, Maranian M et al. A genome screen for linkage in Australian sibling-pairs with multiple sclerosis. Genes Immun 2002; 3(8):464-469.
154. Sawcer S, Maranian M, Setakis E, Curwen V, Akesson E, Hensiek A et al. A whole genome screen for linkage disequilibrium in multiple sclerosis confirms disease associations with regions previously linked to susceptibility. Brain 2002; 125(Pt 6):1337-1347.

155. Coraddu F, Lai M, Mancosu C, Cocco E, Sawcer S, Setakis E et al. A genome-wide screen for linkage disequilibrium in Sardinian multiple sclerosis. J Neuroimmunol 2003; 143(1-2):120-123. 156. Goedde R, Sawcer S, Boehringer S, Miterski B, Sindern E, Haupts M et al. A genome screen for linkage disequilibrium in HLA-DRB1*15-positive Germans with multiple sclerosis based on 4666 microsatellite markers. Hum Genet 2002; 111(3):270-277. 157. Ban M, Sawcer SJ, Heard RN, Bennetts BH, Adams S, Booth D et al. A genome-wide screen for linkage disequilibrium in Australian HLA-DRB1*1501 positive multiple sclerosis patients. J Neuroimmunol 2003; 143(1-2):60-64. 158. Bielecki B, Mycko MP, Tronczynska E, Bieniek M, Sawcer S, Setakis E et al. A whole genome screen for association in Polish multiple sclerosis patients. J Neuroimmunol 2003; 143(1-2):107-111. 159. Eraksoy M, Kurtuncu M, Akman-Demir G, Kilinc M, Gedizlioglu M, Mirza M et al. A whole genome screen for linkage in Turkish multiple sclerosis. J Neuroimmunol 2003; 143(1-2):17-24. 160. Eraksoy M, Hensiek A, Kurtuncu M, Akman-Demir G, Kilinc M, Gedizlioglu M et al. A genome screen for linkage disequilibrium in Turkish multiple sclerosis. J Neuroimmunol 2003; 143(1-2):129-132. 161. Giedraitis V, Modin H, Callander M, Landtblom AM, Fossdal R, Stefansson K et al. Genome-wide TDT analysis in a localized population with a high prevalence of multiple sclerosis indicates the importance of a region on chromosome 14q. Genes Immun 2003; 4(8):559-563. 162. Goris A, Sawcer S, Vandenbroeck K, Carton H, Billiau A, Setakis E et al. New candidate loci for multiple sclerosis susceptibility revealed by a whole genome association screen in a Belgian population. J Neuroimmunol 2003; 143(1-2):65-69. 163. Heggarty S, Sawcer S, Hawkins S, McDonnell G, Droogan A, Vandenbroeck K et al. A genome wide scan for association with multiple sclerosis in a N. Irish case control population. Journal of Neuroimmunology 2003; 143(1-2):93-96. 164. 
Hensiek AE, Roxburgh R, Smilie B, Coraddu F, Akesson E, Holmans P et al. Updated results of the United Kingdom linkage-based genome screen in multiple sclerosis. J Neuroimmunol 2003; 143(1-2):25-30. 165. Laaksonen M, Jonasdottir A, Fossdal R, Ruutiainen J, Sawcer S, Compston A et al. A whole genome association study in Finnish multiple sclerosis patients with 3669 markers. J Neuroimmunol 2003; 143(1-2):70-73. 166. Santos M, Pinto-Basto J, Rio ME, Sa MJ, Valenca A, Sa A et al. A whole genome screen for association with multiple sclerosis in Portuguese patients. J Neuroimmunol 2003; 143(1-2):112-115. 167. Weber A, Infante-Duarte C, Sawcer S, Setakis E, Bellmann-Strobl J, Hensiek A et al. A genome-wide German screen for linkage disequilibrium in multiple sclerosis. J Neuroimmunol 2003; 143(1-2):79-83. 53 Chapter 2

168. Kenealy SJ, Babron MC, Bradford Y, Schnetz-Boutaud N, Haines JL, Rimmler JB et al. A second-generation genomic screen for multiple sclerosis. Am J Hum Genet 2004; 75(6):1070-1078. 169. Dyment DA, Sadovnick AD, Willer CJ, Armstrong H, Cader ZM, Wiltshire S et al. An extended genome scan in 442 Canadian multiple sclerosis-affected sibships: a report from the Canadian Collaborative Study Group. Hum Mol Genet 2004; 13(10):1005-1015. 170. Godde R, Rohde K, Becker C, Toliat MR, Entz P, Suk A et al. Association of the HLA region with multiple sclerosis as confirmed by a genome screen using >10,000 SNPs on DNA chips. J Mol Med 2005; 83(6):486-494. 54 Genetic epidemiology of MS

Chapter 3
Mapping of a susceptibility gene for multiple sclerosis to the 51 kb interval between G511525 and D6S1666 using a new method of haplotype sharing analysis

Maartje Boon, Ilja M. Nolte, Marcel Bruinenberg, Geert T. Spijker, Peter Terpstra, John Raelson, Jacques De Keyser, Cees P. Zwanikken, Miriam Hulsbeek, Robert M.W. Hofstra, Charles H.C.M. Buys, Gerard J. te Meerman

Neurogenetics (2001) 3(4):221-230

3.1 Abstract

Multiple sclerosis (MS) is a complex disease with a partly genetic origin. Although an association with specific HLA-types has been known for almost thirty years, the nature of this relationship has remained unclear. Furthermore, genetic resolution sufficient to implicate a specific gene in the HLA-region has not been achieved. Many loci in the HLA-region have been found to be significantly associated with MS, which is largely explained by the extended haplotype sharing and varying marker informativity of the region. We have determined 248 haplotypes of MS patients from the population of the northern Netherlands and 226 haplotypes of their relatives as controls, using a set of 22 microsatellite markers covering the HLA-region. The data were analyzed using standard association methods and a new statistical method, Haplotype Sharing Statistics (HSS). HSS determines the extent of haplotype sharing for all pairs of haplotypes of patients and of controls and calculates the difference in mean haplotype sharing between patients and controls. Haplotype sharing was found to be significantly greater among patients than among controls in a region of 1.1 Mb between markers G511525 and TNFa. This region is also supported by association analysis and the Transmission/Disequilibrium Test (TDT). Within this region, HSS, which is largely independent of association and TDT, indicated the interval of 51 kb between G511525 and D6S1666 as the interval most likely to contain a susceptibility gene for MS. According to our present knowledge, DQB1 is the sole gene in this interval. Therefore, the results of our analysis suggest that this gene plays a role in the pathogenesis of MS.

Keywords: multiple sclerosis, genes, haplotype sharing, human leukocyte antigens, linkage disequilibrium

3.2 Introduction

Multiple sclerosis (MS [MIM 126200]) is characterized by inflammation and demyelination within the central nervous system. The cause of MS is unknown, but there is evidence for both genetic and environmental factors [1]. MS segregation studies have reported a high relative risk for family members [2-4], strong association with HLA-DR and -DQ alleles [5-10] and limited segregation distortion [11,12]. The association between MS and specific alleles of the Human Leukocyte Antigen (HLA)-system has been known for many years [13,14] and has more recently been confirmed in genome screens [15,16]. However, it is still unclear whether the HLA-system itself plays a functional role in susceptibility to MS, or whether a separate gene, in linkage disequilibrium with HLA, is involved [17,18].

The HLA region on the short arm of chromosome 6 is characterized by inheritance of very long blocks of DNA, conserved over many centimorgans through thousands of meioses, and by potential hotspots for recombination between some of these blocks [19,20], reviewed by Carrington [21]. This has also become evident from the genetic analysis of, for instance, hemochromatosis [22-24] and psoriasis [25]. The interpretation of such extended haplotype sharing is difficult. Is there a strong founder effect of recent mutations flanked by long stretches of co-inherited DNA because of extremely low local recombination values, is the extended sharing due to the presence in the region of more than one gene contributing to the disease, or is there a selective advantage of specific combinations of alleles on one haplotype? In order to investigate this question further, we have used a new method for quantitative analysis of haplotype similarity, analyzing differences between haplotypes present in patients and either non-transmitted haplotypes present in parents or spousal control haplotypes [26-28]. The theoretical background of the method and its validation by simulation studies and by application to real data sets are described by Nolte et al. [29].

3.3 Subjects and Methods

Patients and controls

A total of 400 MS patients from the outpatient population of the University Hospital Groningen and the Martini Hospital in Groningen were contacted by letter. For the initial analysis, 124 MS patients with ancestry within the three northeast provinces of the Netherlands were selected. All patients were diagnosed with MS according to standard criteria [30]. DNA from parents was available in 71 families. In these families, haplotypes that were not transmitted from the parent to the affected child served as controls. Some patients had only one living parent, which reduced the number of control haplotypes relative to the number of patient haplotypes (248 patient and 226 control haplotypes). DNA from children, spouses or sibs was available in 53 families. In these cases, the chromosomes of the spouse or non-shared chromosomes of sibs were used for control haplotypes. For phase determination, DNA of the available relatives was used. The Ethical Committees of the University Hospital Groningen and the Martini Hospital approved the study. All participants gave their informed consent; children under 18 were excluded.

Genotyping

DNA was extracted from 20 ml of EDTA-blood following standard procedures [31]. All patients and controls were genotyped using a set of 22 polymorphic markers (see figure 1). Oligonucleotide primers were chosen (table 1) and marker order was determined according to the available maps (Stanford RH map, Whitehead Institute / M.I.T. map, NCBI HGSI map, Généthon map [32], map around HFE [22], Sanger Institute physical map of chromosome 6, map by R. Raha-Chowdhury (personal communication), Genemap 98 [33], complete sequence of the HLA-region [34]). For markers located outside the sequenced area we used the order that appeared most strongly supported according to the sources (figure 1). One oligonucleotide primer for each marker was 5′ end-labeled with FAM, HEX, TET or NED fluorochromes.
Amplification was performed on a 9700 (PE Biosystems) or a PTC-225 (MJ Research) thermal cycler. Reaction mixtures of 10 µl contained 60 ng genomic DNA, 2 ng of each primer, 0.2 mM dNTPs (Amersham-Pharmacia), 2.5 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl pH 8.3 and 0.25 Units AmpliTaq Gold (PE Biosystems) or 0.25 Units Taq DNA polymerase (Roche). Cycling started with denaturation at 96 °C for 10 minutes, followed by 30 cycles each consisting of denaturation at 96 °C for 30 seconds, annealing at 55 °C or 57 °C for 30 seconds and extension at 72 °C for 1 minute. The last cycle ended with extension at 72 °C for 30 minutes. Fragments were pooled based on expected length and the fluorescent label used. Electrophoresis was performed on either an ABI 377 (PE Biosystems) or a MegaBACE 1000 (Amersham-Pharmacia) automated sequencer. Data from the ABI 377 were analyzed using the Genescan 2.0 and Genotyper 1.1 software packages. The MegaBACE traces were reviewed with the Genetic Profiler 1.0 software package. The DNA sequence of the interval between markers G511525 and D6S1666 was screened for open reading frames (ORFs) using the BLAST programs against the expressed sequence tag (EST) database and against the Genbank database (both at NCBI).

Figure 1: Map of markers and genes in the region. [Physical map of the HLA-region (3,838,986 bp from Sanger Centre), scale in Mb. Gene labels recoverable from the figure: DPB1, DPA1, TAP1, DOB, DQA2, DQA1, LMP7, DQB2, DRB1, TAP2, LTA, HLA-B, HLA-C, HLA-E, HLA-A, G, F, MOG, HFE. The markers shown are those listed in table 1.] Map constructed according to the published sequence of the MHC region [33] and other available sources (see Subjects and Methods) [22,31]. The positions of the markers used for the analysis are indicated above the bar, gene positions below the bar. The precise position of the markers outside the sequenced region (shaded part of bar) is not known, but the order shown is supported by most of the sources. The distances indicated on the scale are in Mb.

Table 1: Markers and primer sequences

Locus  Marker   Forward primer             Reverse primer
1      D6S1560  CTCCAGTCCCCACTGC           CCCAAGGCCACATAGC
2      RING3CA  TGCTTATAGGGAGACTACCG       GATGGGAAGTTTCCAGAGTG
3      D6S2445  AATATGATGGAAGAAGTAATCCAG   GGATTACAGGTATAAGCCATTG
4      TAP1     GCTTTGATCTCCCCCCTC         GGACAATATTTTGCTCCTGAGG
5      D6S2444  GAGCCAAGAACCCAGCATTC       GGAAGGATTCTAAATAGGGGAG
6      G511525  GGTAAAATTCCTGACTGGCC       GACAGCTCTTCTTAACCTGC
7      D6S1666  CTGAGTTGGGCAGCATTTG        ACCCAGCATTTTGGAGTTG
8      D6S273   GCAACTTTTCTGTCAATCCA       ACCAAACTTCAAATTTTCGG
9      TNFa     GCCTCTAGATTTCATCCAGCCACA   CCTCTCTCCCCTGCAACACACA
10     D6S265   ACGTTCGTACCCATTAACCT       ATCGAGGTAAACAGCAGAAA
11     D6S478   CCTCCATAATTGTGTGAGCC       CCAATCTTCTAACCCAAGCA
12     D6S1683  CTGCACATGTATCCGAGAA        TTTNAAGTAGAGACAGGATTTCTTG
13     D6S306   TTTACTTCTGTTGCCTTAATG      TGAGAGTTTCAGTGAGCC
14     D6S464   TGCTCCATTGCACTCC           CTGATCACCCTCGATATTTTAC
15     D6S1558  GCTACTTGGGAGGCTGGAC        CTGGCAGGAGGGCTAGTG
16     D6S1621  AAAGATTTAGAGTAAATGCTGATGA  ACCACAGATGAGAATGCCTT
17     D6S1281  GATGCCACGTTTTAAAATGC       AGAAGCAGCTGTGCTTTGTT
18     D6S1545  AATCTATGCTCCTGGGTTG        GAAGTTCTGGAAATACAGCCTC
19     D6S276   TCAATCAAATCATCCCCAGAAG     GGGTGCAACTTGTTCCTCCT
20     D6S299   AGGTCATTGTGCCAGG           TGTCTATGTATACTCCTGAATGTCT
21     D6S1691  AGGACAGAATTTTGCCTC         GCTGCTCCTGTATAAGTAATAAAC
22     D6S461   TATGACTTCTGGACAGTTAGGGG    ACAACCCATCAGCCCACT

Theoretical background

For analysis of the data we used haplotype sharing statistics (HSS), extensively described by I.M. Nolte et al. [29] and summarized here. It is likely that a proportion of the patients with a similar disease have inherited their predisposing mutation from a common ancestor, especially in a founder population. Therefore, DNA flanking the mutation on both sides will be identical by descent in these patients. The length of the DNA segment that is identical depends on the number of recombinations that have taken place on either side of the mutation since it occurred. This number depends mainly on the number of meioses that have taken place and on the recombination frequency of the region concerned [35,36].
The number of meioses between patients, selected from a population conditional on their disease, is likely to be smaller than the number of meioses between a more or less random sample of controls from that population. Therefore, the length of the identical or shared DNA sequence comprising a predisposing mutation in patients is likely to be larger than the length of the shared DNA sequence comprising the same locus in controls, who most often carry (older) wildtype alleles. The difference between patients and controls in sharing of DNA flanking a specific locus can thus be used as an indication of the involvement of this locus in disease susceptibility.

Methods

The sharing of two haplotypes at a marker locus is defined as the number of intervals between consecutive markers that flank the locus on both sides and that show similar alleles on the two haplotypes. The sharing is evaluated at each marker locus for all pairs of haplotypes of patients, for all pairs of haplotypes of controls and for mixed pairs. Linkage disequilibrium (LD) is determined in order to explore the identity by descent (IBD) status of similar haplotypes. If LD is strong, the probability that two identical haplotypes are IBD, as opposed to identical by state (IBS), is higher. If haplotypes are IBS and not IBD, haplotype sharing is random. In that case, the power of analyses based on haplotype sharing to identify regions of interest will be small. As a test of LD, the statistical significance of the sharing of the observed haplotypes can be evaluated by a randomization test. For this test, the observed alleles are redistributed over their loci, haplotype sharing is calculated and the results are used as a randomization statistic. Using a t-test, the haplotype sharing observed in the data is compared with the haplotype sharing in the randomized data, in which there is linkage equilibrium (LE) between all marker loci. The variance of haplotype sharing under LE is estimated by repeated randomization. For fine mapping, a test called SHARING is used. It compares the mean haplotype sharing among patients with the mean haplotype sharing among controls by means of a Student's t-test.
Since the observed haplotype similarities are not independent, the variance of the haplotype sharing cannot be calculated by means of a closed formula. Hence, a randomization procedure is used: a sample half the size of the observed group is drawn without replacement and haplotype sharing is calculated for this sample. This procedure is repeated, providing a variance for the haplotype sharing under LD. If the above test shows significant results, an additional test for fine mapping is performed, the directional test. This test evaluates haplotype sharing from a chosen locus towards the centromeric and the telomeric side separately, in the same way as the SHARING test. The region between the peaks of the two curves thus obtained is the region most likely to contain a disease susceptibility gene.
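The sharing measure and the half-sample variance estimate described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of a haplotype as a list of allele codes, and the way the two half-sample standard deviations are combined into a single statistic are all assumptions made for illustration, and missing or phase-unknown alleles are not handled.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def pair_sharing(h1, h2, locus):
    """Number of marker intervals around `locus` with identical alleles on both haplotypes."""
    if h1[locus] != h2[locus]:
        return 0
    left = right = locus
    while left > 0 and h1[left - 1] == h2[left - 1]:
        left -= 1
    while right < len(h1) - 1 and h1[right + 1] == h2[right + 1]:
        right += 1
    return right - left  # intervals between the outermost shared markers


def mean_sharing(haplotypes, locus):
    """Mean sharing at `locus` over all unordered pairs in one group (needs >= 2 haplotypes)."""
    return mean(pair_sharing(a, b, locus) for a, b in combinations(haplotypes, 2))


def half_sample_sd(haplotypes, locus, n_rep=1000, seed=0):
    """SD of mean sharing over repeated half-size samples drawn without replacement."""
    rng = random.Random(seed)
    half = len(haplotypes) // 2
    reps = [mean_sharing(rng.sample(haplotypes, half), locus) for _ in range(n_rep)]
    return stdev(reps)


def sharing_statistic(patients, controls, locus, n_rep=1000, seed=0):
    """Assumed form of the test statistic: difference in means over the combined SD."""
    diff = mean_sharing(patients, locus) - mean_sharing(controls, locus)
    sd_p = half_sample_sd(patients, locus, n_rep, seed)
    sd_c = half_sample_sd(controls, locus, n_rep, seed + 1)
    return diff / (sd_p ** 2 + sd_c ** 2) ** 0.5
```

For two haplotypes `[1, 2, 3, 4, 5]` and `[1, 2, 3, 9, 5]`, `pair_sharing` at the second marker (index 1) counts the two shared intervals between the first three markers, while at the fourth marker the alleles differ and the sharing is zero.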

Besides HSS, we also applied one- and two-locus association analyses and the Transmission/Disequilibrium Test (TDT). For the one-locus association analysis the frequencies of the alleles, and for the two-locus association analysis the frequencies of two-locus haplotypes, are compared by means of chi-square tables. Only those alleles that have an expectation of at least one copy are taken into account. For the TDT analysis, the transmission distortion of each allele versus all others is examined with the TDT proposed by Spielman [37]. For further information on the theoretical background, for validation by simulations and by application to real data sets and for comparison of the results of HSS with those of association and TDT analysis, we refer to I.M. Nolte et al. [29].

Missing data

In case of missing or phase-unknown alleles, haplotype sharing was averaged over all possibilities, weighted by the a priori probabilities. For an unknown allele, the a priori probability is the allele frequency and for a phase-unknown allele it is 0.5, so LD information is not used in this procedure.

Parental vs. spousal controls

From the point of view of population stratification, haplotypes of parents that were not transmitted to the affected child are ideal as control haplotypes. Since these haplotypes were sometimes not available, haplotypes of the spouse of the patient or non-shared haplotypes of sibs were used. We analyzed the data using both control populations separately (not shown); both analyses showed the same results, except for a lower significance caused by the smaller sample sizes. Therefore, we decided to analyze the two types of controls together.
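As a reminder of how the TDT for one allele versus all others works, the following Python sketch computes the standard McNemar-type statistic from transmission counts. The function name and the example counts are hypothetical and are not taken from this study; only heterozygous parents are assumed to have been counted.

```python
import math


def tdt(transmitted, not_transmitted):
    """TDT for one allele versus all others.

    transmitted / not_transmitted: number of times parents heterozygous for the
    allele did or did not pass it on to the affected child.
    Returns the chi-square statistic (1 degree of freedom) and its p value.
    """
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 df, written via erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: allele transmitted 30 times, not transmitted 10 times.
chi2, p = tdt(30, 10)
print(chi2, p)
```

Equal transmission and non-transmission counts give a statistic of zero, which corresponds to the absence of transmission distortion.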

3.4 Results

Missing data

Data set completion levels were high; few unknown alleles were observed (2.7%). Phase could not be determined in 10% of patients and 3.3% of controls. We handled these missing data as described under Subjects and Methods.

Linkage disequilibrium

The results of the test for multilocus linkage disequilibrium (LD) are shown in figure 2. The standard deviation of the mean haplotype sharing was calculated by repeating the randomization described under Subjects and Methods 1000 times.

Figure 2: Deviation from linkage equilibrium in patients and controls. [Plot of -log10(p value) against locus for patients and controls.] Deviation from multilocus linkage equilibrium (LE) in patients (solid line) and controls (dotted line), evaluated by means of haplotype sharing. Multilocus LE is simulated by permutation of the alleles over the haplotypes. The results are represented as the -log10 of the p value for the difference in haplotype sharing between the permuted and the observed haplotypes. The standard deviation of the mean sharing is calculated from repeated random redistribution.
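The permutation used to simulate multilocus LE can be sketched as follows in Python. This is an illustrative reading of the procedure, not the original software: alleles are shuffled independently at each locus across haplotypes, the mean pairwise sharing is recomputed, and the observed value is compared with the permutation distribution. The function names, the list-of-allele-codes representation and the z-score form of the comparison are assumptions.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def pair_sharing(h1, h2, locus):
    """Number of marker intervals around `locus` with identical alleles on both haplotypes."""
    if h1[locus] != h2[locus]:
        return 0
    left = right = locus
    while left > 0 and h1[left - 1] == h2[left - 1]:
        left -= 1
    while right < len(h1) - 1 and h1[right + 1] == h2[right + 1]:
        right += 1
    return right - left


def mean_sharing(haplotypes, locus):
    """Mean sharing at `locus` over all unordered pairs of haplotypes."""
    return mean(pair_sharing(a, b, locus) for a, b in combinations(haplotypes, 2))


def permute_within_loci(haplotypes, rng):
    """Shuffle the alleles at each locus over the haplotypes, keeping allele counts fixed."""
    columns = [list(col) for col in zip(*haplotypes)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


def ld_z_score(haplotypes, locus, n_perm=1000, seed=0):
    """Observed mean sharing relative to its distribution under simulated LE."""
    rng = random.Random(seed)
    observed = mean_sharing(haplotypes, locus)
    null = [mean_sharing(permute_within_loci(haplotypes, rng), locus)
            for _ in range(n_perm)]
    return (observed - mean(null)) / stdev(null)
```

Because the shuffle keeps the allele frequencies at every locus unchanged, any excess of observed sharing over the permuted sharing reflects association between loci rather than differences in marker informativity.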

Excess sharing over linkage equilibrium was observed over almost the entire region, indicating the presence of LD. LD was stronger in patients than in controls over the entire region, with a maximum at marker 7 (-log10(p value): 1357, equivalent to a z-value of 79.0). LD was strong in the region between markers 1 and 14 both in patients and in controls, thus making IBD of similar haplotypes very likely. In the region of markers 15 to 19, there was no LD in controls (-log10(p value) 0 to 1.94), and only weak LD in patients (-log10(p value) 9.1 to 57.7). For this region, HSS is therefore expected to be less powerful. In the most telomeric region, LD became stronger again in patients and in controls.

Difference in mean sharing between patients and controls

The -log10 of the p value for the difference in mean length of haplotype sharing between patients and controls, as calculated by a Student's t-test, is shown in figure 3. The standard deviation was estimated from 1000 repeated samplings without replacement of 50% of the observed haplotypes, for patients and controls separately. The maximum -log10(p value) of 4.3 was found at D6S1666 (locus 7).

Figure 3: Difference in haplotype sharing between patients and controls. [Plot of -log10(p value) against locus.] Difference in haplotype sharing between patient and control haplotypes as calculated by a t-test, expressed as the -log10 of the p value. The standard deviation is calculated by repeated sampling without replacement of 50% of the observed haplotypes.
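The correspondence quoted for marker 7 between a z-value of 79.0 and a -log10(p value) of 1357 can be checked numerically. Such a p value is far below the smallest representable floating-point number, so the Python sketch below works on the log scale with the standard asymptotic expansion of the normal tail. It assumes a one-sided normal tail, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def neg_log10_upper_tail(z):
    """-log10 of the one-sided normal tail probability P(Z > z)."""
    if z < 30.0:
        # erfc is accurate in this range and does not underflow.
        return -math.log10(0.5 * math.erfc(z / math.sqrt(2.0)))
    # Asymptotic expansion for large z:
    # P(Z > z) ~ phi(z) / z * (1 - 1/z^2 + 3/z^4), evaluated on the log scale.
    log_p = (-0.5 * z * z - math.log(z) - 0.5 * math.log(2.0 * math.pi)
             + math.log(1.0 - 1.0 / z ** 2 + 3.0 / z ** 4))
    return -log_p / math.log(10.0)


print(neg_log10_upper_tail(79.0))
```

Under this assumption the result lies between 1357 and 1358, in line with the reported value; a two-sided tail would lower it by log10(2), about 0.3, which leaves the integer part unchanged.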

Directional analysis

The results of the directional test are shown in figure 4. They are presented as the -log10 of the p value for the difference between patients and controls in mean length of sharing in either direction. The directional test supported the presence of a susceptibility gene in the interval between markers G511525 and D6S1666 (loci 6 and 7): sharing in the centromeric direction showed a peak at locus 7 with a -log10(p value) of 3.9 and sharing in the telomeric direction showed a peak at locus 6 with a -log10(p value) of 4.2.

Figure 4: Directional analysis. [Plot of -log10(p value) against locus for the centromeric and telomeric directions.] Analysis of the difference between patients and controls in mean length of haplotype sharing from each locus in the centromeric (solid line) and in the telomeric (dotted line) direction separately. The results are represented as the -log10(p value). The standard deviation is calculated by repeated sampling without replacement of 50% of the observed haplotypes.
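The one-sided sharing that underlies the directional test can be sketched in Python as follows. This is an illustration under stated assumptions rather than the published algorithm: haplotypes are lists of allele codes in map order, the function name is hypothetical, and which index direction corresponds to the centromeric or the telomeric side depends on the marker order used.

```python
def directional_sharing(h1, h2, locus, step):
    """Count shared marker intervals from `locus` outward in one direction only.

    step = -1 walks towards lower marker indices, step = +1 towards higher ones.
    """
    if step not in (-1, 1):
        raise ValueError("step must be -1 or +1")
    if h1[locus] != h2[locus]:
        return 0
    count = 0
    pos = locus + step
    # Extend while the next marker in the chosen direction is still identical.
    while 0 <= pos < len(h1) and h1[pos] == h2[pos]:
        count += 1
        pos += step
    return count


# Two hypothetical haplotypes that agree only on the three middle markers.
a = [1, 2, 3, 4, 5]
b = [9, 2, 3, 4, 8]
print(directional_sharing(a, b, 1, +1), directional_sharing(a, b, 1, -1))
```

Averaging this quantity over pairs of patient and control haplotypes, separately for each direction, gives the two curves whose peaks delimit the candidate interval.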

Clustering of identical haplotypes

A color-transformed representation of the haplotypes of patients (n=248, right) and of controls (n=226, left) is shown in figure 5. Haplotypes are shown as vertical lines composed of colored fragments, representing alleles. The resolution is such that individual alleles are visible. Marker alleles were mapped to colors, with different alleles at a locus having different colors. Marker alleles have been recoded to give minimal changes in color from one marker locus to the next in long haplotypes. Haplotypes were clustered for maximal similarity centered at marker D6S1666. This marker was chosen because the difference in haplotype sharing between patients and controls was maximal at this locus. The same clustering algorithm was applied to patient and control haplotypes.

Figure 5: Color-transformed representation of haplotypes. [Image with controls on the left and patients on the right; the 22 marker loci, D6S1560 to D6S461 in the order of table 1, are listed along the y-axis.] Haplotypes of patients (n=248) and controls (n=226). In the direction of the x-axis, haplotypes are clustered. Patients are in the right half of the figure, indicated by a red bar in front of the haplotypes, controls in the left half, indicated by a blue bar. Each vertical colored line represents one haplotype. On the y-axis, marker loci are indicated. Marker alleles are mapped to colors. Different alleles at a locus have different colors. Colors are chosen such that only minimal changes in color appear from one marker locus to the next in long haplotypes. Haplotypes are clustered for maximal similarity at marker D6S1666.

The most striking difference is the much higher frequency of long haplotypes in patients than in controls, in particular those haplotypes that appear