Estimates for the mutation rates of spoligotypes and VNTR types of Mycobacterium tuberculosis
  Josephine F. Reyes and Mark M. Tanaka
  November 30, 2009
Understanding diversity in bacterial pathogens
  How did transmission (infection) occur among individuals?
  What are the relationships among strains infecting different individuals?
  How do molecular markers change?
  At what rate do these changes occur?
Mycobacterium tuberculosis
  M. tuberculosis is the causative bacterium of tuberculosis (TB)
  TB disease kills a person every 15 seconds
  Growing number of studies on its molecular epidemiology
Studies on molecular epidemiology of TB (PubMed search results)
  [Figure: number of papers (0 to 80) by publication year (1990 to 2005) for IS6110, spoligotyping, and VNTR]
MTB genome
  VNTR: variable numbers of tandem repeats
  spoligotypes: spacer oligonucleotide typing
Molecular markers for TB: VNTRs and spoligotypes
  [Figure: genotyping panel; column labels: IS6110 DNA fingerprint, Spoligotype, MIRU-VNTR-pattern, Resistance, pks1/15, Genotype, S H REZ; genotype labels: EAI, West African-1, West African-2, Beijing, Cameroon, Sierra Leone-2, Sierra Leone-1]
  Homolka et al, BMC Microbiology 2008 8:103 doi:10.1186/1471-2180-8-103
Spoligoforests and minimum spanning trees
  [Figures: example spoligoforests and minimum spanning trees]
  Reyes et al. BMC Bioinformatics 2008 9:496
  Figure from Mokrousov et al, Infection, Genetics and Evolution 9 (2009) 115
  Homolka et al. BMC Microbiology 2008 8:103
Application: Analysis of homoplasy
  [Figure: panels for µ = 0.07, µ = 0.04, µ = 0.01, and µ = 0.005; axis "Number of genos" running from 0 to 170]
Mutation rates
  The mutation rate is the rate at which mutations appear within the M. tuberculosis population in patients and reach fixation (events per case per year).
  Mutation plays a large role in the diversity of molecular markers.
  The utility of molecular markers in epidemiology is enhanced by knowledge of mutation rates (and the factors affecting them) and of mutation mechanisms.
Table: Mutation rate of a VNTR locus per year
  Organism          Reference            Point estimate   No. of loci used
  M. tuberculosis   Grant et al 2008     10^-5            12
  M. tuberculosis   Wirth et al 2008     10^-3.9          24 (5)
  Y. pestis         Vogler et al 2007    0.098            14
  E. coli           Vogler et al 2006    0.626            25
Table: Mutation rate of a spoligotype per year
  Label        Reference                        Point estimate   (S.E.)
  spol-L09-C   Luciani et al 2009 (Cuba)        0.0133           0.009000
  spol-L09-V   Luciani et al 2009 (Venezuela)   0.0029           0.001975
Methods
  Diversity parameter involving µ: $\theta = 2 N_e \mu$
  Point estimate for the mutation rate of a molecular marker (MM):
  $$\hat{\mu}_{MM} = \frac{\hat{\mu}_{MM}}{\hat{\mu}_{reference}}\,\hat{\mu}_{reference} = \frac{\hat{\theta}_{MM}}{\hat{\theta}_{reference}}\,\hat{\mu}_{reference}$$
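The step from a ratio of mutation rates to a ratio of diversity parameters can be spelled out as follows. This is a sketch that assumes both markers are typed on the same isolates and so share one effective population size $N_e$; that assumption is made explicit here and is not stated on the slide.

```latex
\frac{\theta_{MM}}{\theta_{reference}}
  = \frac{2 N_e \mu_{MM}}{2 N_e \mu_{reference}}
  = \frac{\mu_{MM}}{\mu_{reference}}
\quad\Longrightarrow\quad
\mu_{MM} = \frac{\theta_{MM}}{\theta_{reference}}\,\mu_{reference}
```

Under that assumption $N_e$ cancels, so only the two diversity estimates and an independent reference rate are needed.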
Reference estimates for the mutation rate of IS6110 types (used as $\hat{\mu}_{reference}$)
  Label      Reference                       Point estimate   S.E.
  IS-TR01    Tanaka & Rosenberg 2001         0.1390           0.02308
  IS-R03     Rosenberg et al 2003            0.2870           0.10300
  IS-dB99    de Boer et al 1999              0.2166           0.04786
  IS-W02     Warren et al 2002               0.0793           0.00649
  IS-L09-C   Luciani et al 2009 (Cuba)       0.0685           (0.03545)
  IS-L09-E   Luciani et al 2009 (Estonia)    0.0882           (0.04240)
Methods (continued)
  Collect data sets with isolates typed using either:
    IS6110 and VNTR typing
    spoligotyping and VNTR typing
    IS6110 and spoligotyping
  Estimate θ for each data set of sample size n and number of molecular marker types (genotypes) g using
  $$g = \sum_{i=0}^{n-1} \frac{\theta}{\theta + i}$$
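The relation between g, n, and θ has no closed-form inverse, so θ has to be found numerically. A minimal sketch, assuming a simple bisection solver (the function names `expected_types` and `estimate_theta` are illustrative and not taken from the slides; the solver the authors used is not stated):

```python
def expected_types(theta, n):
    """Right-hand side of the relation: sum over i = 0..n-1 of theta / (theta + i)."""
    return sum(theta / (theta + i) for i in range(n))


def estimate_theta(n, g, tol=1e-10):
    """Solve g = sum_{i=0}^{n-1} theta / (theta + i) for theta by bisection.

    The sum increases from 1 (theta -> 0) to n (theta -> infinity),
    so a unique positive solution exists only when 1 < g < n.
    """
    if not 1 < g < n:
        raise ValueError("need 1 < g < n for a finite positive estimate")
    lo, hi = 1e-12, 1.0
    # Grow the upper bracket until the expected count reaches the observed g.
    while expected_types(hi, n) < g:
        hi *= 2.0
    # Standard bisection on the monotone function expected_types(theta) - g.
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if expected_types(mid, n) < g:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical data set: 100 isolates falling into 30 distinct genotypes.
theta_hat = estimate_theta(n=100, g=30)
print(theta_hat, expected_types(theta_hat, 100))
```

Bisection is used here only because the sum is monotone in θ, which guarantees a single root inside the bracket; any one-dimensional root finder would serve the same purpose.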
Values of $\hat{\theta}$ from available data sets
  Among these data sets, 25 are typed by both IS6110 and spoligotyping, 19 by both VNTR and IS6110, and 27 by both VNTR and spoligotyping.
  [Figure: θ-estimate (axis ticks 5, 50, 500, 5000) against rank of θ-estimate (0 to 100) for IS6110 RFLP, spoligotyping, and VNTR typing]
Methods (continued)
  $$\hat{\mu}_{MM} = \frac{\hat{\theta}_{MM}}{\hat{\theta}_{reference}}\,\hat{\mu}_{reference}$$
  Compute $m = \hat{\theta}_{MM} / \hat{\theta}_{reference}$
  Estimate the standard error for m (by bootstrapping)
  Find a confidence interval around $\hat{\mu}_{MM}$ using
  $$p(u) = \int_{\Omega} g(z)\, h(u/z)\, \frac{1}{z}\, dz$$
  where $z = m$ and $u/z = \hat{\mu}_{MM}/m$, with densities g and h, and Ω is the support for z
  Glen et al, Computational Statistics & Data Analysis 44 (2004) 451-464
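The product-density integral can be evaluated numerically once forms for g and h are chosen. The sketch below assumes both densities are normal, centred on the point estimates with the standard errors as standard deviations, and truncates the support to z > 0; these choices, the function names, and the value of m are illustrative assumptions rather than the authors' stated procedure. The reference values are the IS-TR01 row of the reference table (0.1390, S.E. 0.02308).

```python
import math


def normal_pdf(x, mean, sd):
    """Normal density, used here as an assumed form for both g and h."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def product_density(u, m_hat, m_se, ref_hat, ref_se, steps=400):
    """Midpoint-rule approximation of p(u) = integral of g(z) h(u/z) / z over z > 0."""
    z_lo = max(m_hat - 8.0 * m_se, 1e-12)  # truncate the support at z > 0
    z_hi = m_hat + 8.0 * m_se
    dz = (z_hi - z_lo) / steps
    total = 0.0
    for k in range(steps):
        z = z_lo + (k + 0.5) * dz
        total += normal_pdf(z, m_hat, m_se) * normal_pdf(u / z, ref_hat, ref_se) / z
    return total * dz


def product_interval(m_hat, m_se, ref_hat, ref_se, level=0.95, points=400):
    """Equal-tailed interval for u = m * mu_reference from the numerical density."""
    u_hi = (m_hat + 8.0 * m_se) * (ref_hat + 8.0 * ref_se)
    du = u_hi / points
    grid = [(k + 0.5) * du for k in range(points)]
    dens = [product_density(u, m_hat, m_se, ref_hat, ref_se) for u in grid]
    total = sum(dens)  # renormalise to absorb truncation and grid error
    tail = (1.0 - level) / 2.0
    cum, lower, upper = 0.0, None, None
    for u, d in zip(grid, dens):
        cum += d / total
        if lower is None and cum >= tail:
            lower = u
        if upper is None and cum >= 1.0 - tail:
            upper = u
    return lower, upper


# Hypothetical m and bootstrap S.E.; reference rate and S.E. are the IS-TR01 values.
lower, upper = product_interval(m_hat=0.06, m_se=0.01, ref_hat=0.1390, ref_se=0.02308)
print(lower, upper)
```

The interval is read off the cumulative sum of the gridded density, so its resolution is limited by the grid spacing; a finer grid or an adaptive quadrature routine would tighten it.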
Results: relative mutation rates
  [Figure: three panels showing the distribution of m for VNTR relative to IS6110 (m axis 0.04 to 0.10), VNTR relative to spoligotype (m axis 0.15 to 0.40), and spoligotype relative to IS6110 (m axis 0.2 to 0.5)]
Results: new estimates for the mutation rate of a VNTR locus (per year)
  [Figure: estimates on a scale from 10^-2 to 10^-5, one per reference rate: spol-L09-V, spol-L09-C, IS-L09-C, IS-W02, IS-L09-E, IS-TR01, IS-dB99, IS-RTT03]
Results: new estimates for the mutation rate of a spoligotype (per year)
  [Figure: estimates on a scale from 10^-1 to 10^-3, one per reference rate: IS-L09-C, IS-W02, IS-L09-E, IS-TR01, IS-dB99, IS-RTT03]
Summary
  The following issues should be addressed when studying the mutation rate of molecular markers for pathogenic bacteria:
    epidemiologically linked isolates
    accounting for the abundance/frequency of isolates
  We found evidence for higher mutation rate estimates of a VNTR locus (at least 10-fold higher).
  Mutation rate estimates for spoligotypes overlap with previously reported rates.
Future directions
  Use approximate Bayesian computation to find the mutation rate of a VNTR locus
  Compute mutation rates of other genetic markers in other bacterial organisms
Some thank yous
  Richard Zach Aandahl (UNSW School of Maths and Stats)
  Ruiting Lan (UNSW BABS)
  Andrew Francis (University of Western Sydney, School of Computing and Maths)
  Fabio Luciani (UNSW School of Medical Sciences)
  UNSW University International Postgraduate Award (UIPA)