LTA Analysis of HapMap Genotype Data


Introduction. This supplement to "Global variation in copy number in the human genome", by Redon et al., describes the details of the LTA analysis used to screen HapMap CNV calls for somatic artifacts on the basis of the HapMap genotype data (as presented in Supplementary Table 5). Additional analyses of the HapMap data based on the LTA approach are also described in this supplement.

It is generally accepted that an analysis of high-quality SNP genotype data from related individuals can yield information about the location of segregating germline deletions. Recently, methods have been developed that exploit unusual features of such data (i.e., Mendelian inheritance errors (MIs), departures from Hardy-Weinberg equilibrium, and patterns of uncalled genotypes) and have been successfully applied to the high-density genomic SNP data generated by the HapMap project to detect experimentally validated deletions (Conrad et al. 2006; McCarroll et al. 2006). In principle, it should be possible to follow a similar tactic to identify deletions that occur in the soma, for example in cell lines generated from somatic tissue. We set out to explore the feasibility of such an approach, develop a method (if possible), and apply it to the Consortium CNV data to detect potential cell-line artifacts.

The data used for these analyses are the Consortium data (both 500K EA and WGTP calls) and the Phase I HapMap data (release 20, all autosomes; International HapMap Consortium 2005). We make limited use of the Phase II data when trying to establish which CNV calls made in the present study are likely to be somatic. Filtering of the HapMap data was done as described in Conrad et al. (2006).

Outline of the LTA Method. The pattern that we wish to exploit will come from a deletion occurring in the cell line of a previously copy-number normal trio member, at an autosomal locus where the other two trio members have normal (n=2) copy number.
Specifically, we are trying to find clusters of SNPs at which an allele transmitted from parent to child has subsequently been deleted in the parent. We call this approach "Loss of Transmitted Allele" analysis, or LTA for short. This method will not work if the cell line artifact is a duplication, or if the deletion occurs within a person who was only haploid for the segment to begin with. The method is based on the prediction that a SNP genotyping experiment will call a deletion hemizygote as homozygous for the allele that is present. When an allele that is present only once in all four parental gametes is deleted after transmission to a child, that information is erased and it appears that the child has inherited a de novo mutation. SNPs typed across a cell line deletion would hence be enriched for Mendelian inheritance (MI) errors (e.g., LTA Figure 1).

LTA Figure 1: Example of trio genotype configurations used in this analysis.

Proof of Principle. We extended the model described in Conrad et al. (2006) to include genotype configurations that were informative of a somatic deletion, leading to 8 general trio configurations. To gain a better understanding of the patterns of trio genotypes generated by a somatic deletion, we scored the Phase I HapMap SNP genotypes underlying a known cell-line deletion in NA07055 (del(2q23-q24), more than 15 Mb long); the results are presented in LTA Table 1. The run of Type II MIs in the Phase I data caused by this artifact is apparent upon visual inspection (LTA Figure 2).
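The genotype logic behind the LTA idea can be sketched in code. This is a minimal illustration and not the authors' implementation: the function name `classify_trio`, the two-character genotype strings, and the returned labels are hypothetical, and the Type I / Type II definitions encode one reading of the text (Type I: the child is homozygous for an allele that one parent lacks; Type II: a heterozygous child carries an allele seen in neither parent).

```python
def classify_trio(child, mother, father):
    """Classify one biallelic SNP genotype trio (hypothetical labels).

    Genotypes are two-character strings such as "AA" or "AB"; any
    genotype containing "N" (a failed call) is treated as missing.
    """
    genotypes = (child, mother, father)
    if any(len(g) != 2 or "N" in g for g in genotypes):
        return "missing"
    a, b = child
    m, f = set(mother), set(father)
    # Mendelian-consistent: one child allele can come from each parent.
    if (a in m and b in f) or (b in m and a in f):
        return "consistent"
    # Heterozygous child with an allele absent from both parents: this looks
    # like a de novo allele, the pattern expected when a parent's cell line
    # lost the allele it had transmitted.
    if a != b and not {a, b} <= (m | f):
        return "type_II"
    if a == b:
        # Homozygous child whose allele is missing from exactly one parent.
        if a not in m and a in f:
            return "type_I_mother"
        if a not in f and a in m:
            return "type_I_father"
    return "other"
```

For example, under these assumptions a mother genotyped `AA` (after losing a transmitted `B`), a father genotyped `AA`, and a child genotyped `AB` give `type_II`.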

LTA Table 1: Distribution of trio genotype configurations within known cell line artifact.

Trio Class  Description                                             Count
1           Type I MI, deletion in mother (germline or somatic)     54
2           Type I MI, deletion in father (germline or somatic)     0
3           Type II MI, deletion in either parent (somatic only)    28
4           Uninformative
5           Compatible with somatic deletion in mother
6           Compatible with somatic deletion in father
7           Incompatible with somatic deletion in either parent
8           Compatible with somatic deletion in either parent       532

LTA Figure 2: Pattern of Type II MIs created by somatic deletion. Each Type II MI in the Phase I HapMap data from chromosome 2 is plotted at its physical position (NCBI34) for all 30 CEU families. The run of MIs caused by a somatic event starting near 150 Mb is quite apparent. Data are from release 16c.1.

Several important points stand out. First, there is a strong, specific signal in the pattern of MIs from the family of this individual. Individual NA07055 is the mother of the trio. All MIs are either Type II or Type I but from the mother only. The rate of MIs in this stretch is 0.01, twenty-five times the genome average. Second, within the class of somatic deletion compatible trio configurations, configurations specific to maternal deletions outnumber paternal configurations by almost 3:1. However, this indicates that there is some noise in the data and that our modeling assumptions do break down at points. This phenomenon has been noted in prior work, and can, for instance, lead to the erroneous splitting of a single deletion-compatible stretch into multiple smaller stretches.

Screening HapMap data for artifacts. Two major questions can be addressed with this modeling approach. First, how many CNVs called in the present study are not true germline deletions but artifacts of the cell culture process? Second, how many cell line events have occurred within the HapMap as a whole?

500K EA and WGTP HapMap data. Consortium Data. Screening the HapMap CNV calls essentially amounts to a problem of multiple testing, where for each CNV we test the null hypothesis, "This is a germline CNV", based on the trio genotypes at the underlying N HapMap SNPs. We assume that the number of Type II MIs in a set of genotypes from a trio is a binomial random variable, and that the genotyping experiment on a given SNP is independent of experiments at all other SNPs (an assumption not likely to hold for some Perlegen SNPs and instances of pooling). We estimate θ_c, the rate of Type II MIs, across the entire HapMap, for each combination of genotyping platform and genotyping center, a total of 11 rates. If we see x Type II MIs underlying a called CNV, we calculate the probability that the run of MIs is due to random genotyping error as the probability of observing x or more Type II MIs in a stretch of N trio genotypes,

$$P(X \ge x) = \sum_{k=x}^{N} \binom{N}{k}\,\theta^{k}\,(1-\theta)^{N-k}$$

θ here is taken to be the arithmetic mean of the expected Type II rate for all SNPs within the CNV. 500K EA and WGTP CNVs are assessed separately, as are CNVs in each (CEU, YRI) population.

Results. The data for this analysis were all preliminary deletion calls in the CEU and YRI parents (3252 from the WGTP, 1506 from the 500K EA). The method for deciding significant regions was somewhat convoluted, but its goal was to create a conservative set (leaning towards over-calling) of somatic artifact CNVs. First, all CNV calls with no Type II MIs were removed from the list (2748/3252, WGTP; 1392/1506, 500K EA). Notably, only 3.2% (153/4758) of all calls overlapped 2 or more Type II MIs. A p-value was then calculated for each CNV as described above.

During this analysis, it became clear that Type II MIs were sometimes created when two hemizygous parents (each called as a SNP homozygote) give rise to a homozygous null child (erroneously called a heterozygote). Our explanation of this phenomenon is that at high deletion frequencies (perhaps > 40%), SNP genotyping algorithms often cluster CNV status instead of SNP genotype status. To avoid such false positives, we removed CNVs from regions with more than 2 "deletion" CNVs segregating in the same population and detected by the same platform. Other, similar, frequency thresholds were tested but did not substantially change the number of artifact calls (data not shown). We note that identifying somatic variants at loci with common germline variation is a challenging problem for any approach using only cell line DNA (see Caveats section).

CNVs from the remaining set were deemed significant if they had a significant Bonferroni-corrected p-value, corrected against the number of CNVs left within the same platform-population group (e.g., CEU-WGTP CNVs). In total, we made 16 calls, with uncorrected p-values ranging from to 3.3 x (presented in Supplementary Table 5).
In a separate analysis of offspring genotypes, one additional CNV was called as a cell-line artifact on the basis of the SNP failure pattern. Strictly speaking, it is not possible to distinguish a de novo mutation from a somatic event in this case without additional biological material from the donor (or perhaps DNA from such a person's offspring).
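The per-CNV test described in this section (a binomial upper tail for the number of Type II MIs, followed by a Bonferroni correction within a platform-population group) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the significance level `alpha=0.05` is an assumed default that the text does not state, and the toy numbers are invented.

```python
from math import comb


def binom_upper_tail(x, n, theta):
    """P(X >= x) for X ~ Binomial(n, theta), summed term by term.

    Fine for modest n; for thousands of SNPs a library routine is safer
    because comb(n, k) can exceed floating-point range.
    """
    return sum(comb(n, k) * theta**k * (1.0 - theta)**(n - k)
               for k in range(x, n + 1))


def bonferroni_calls(pvalues, alpha=0.05):
    """Flag p-values that remain below alpha after Bonferroni correction.

    alpha is an assumed value; the number of tests is the number of CNVs
    left in the same platform-population group.
    """
    m = len(pvalues)
    return [min(1.0, p * m) < alpha for p in pvalues]


# Toy example (invented numbers): 3 Type II MIs among 40 trio genotypes,
# with a mean expected Type II MI rate of 4e-4 across the CNV.
p = binom_upper_tail(3, 40, 4e-4)
```

For long CNVs, `scipy.stats.binom.sf(x - 1, n, theta)` computes the same tail probability more robustly.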

Power. The power to detect somatic events in the HapMap data is a function of the number of SNPs that are typed within the event and the allele frequencies at those SNPs. One possible limitation to the power of our analysis is the relationship between SNP density and the location of CNVs. Analyses by our group and others suggest that SNP density is substantially lower near CNV regions. To assess the impact of SNP density, we conducted a power simulation using the complete set of 4758 parental deletions. For each CNV locus, we simulated a somatic event in each parent of each of the other 29 families, and recorded the number of times the event would have been detected using the same thresholds of significance as our initial screen (YRI, p < 0.003; CEU, p < ). As expected, the mean power to detect WGTP somatic CNVs (YRI, 0.61; CEU, 0.48) was much greater than the power to detect somatic CNVs identified with the 500K EA (YRI, 0.23; CEU, 0.15). Results are shown in LTA Figure 3. The WGTP numbers are likely to be somewhat inflated, as the size of each event detected on this platform will be over-estimated on average. Although power is low in many regions, the total number of artifacts called is not likely to be off by more than a factor of 5 based on this analysis.

LTA Figure 3: Power to detect artifacts in consortium data. The relationship between CNV size and power to reject the hypothesis that the CNV is a germline event using Phase II HapMap data. Red points indicate CNVs detected with the WGTP; blue points, CNVs detected with the 500K EA.
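The dependence of power on the number of typed SNPs can be illustrated with a simplified analytic calculation. This is not the family-by-family simulation described above: it assumes that every SNP inside a somatic deletion independently produces a Type II MI with a single probability `q_somatic` (a hypothetical parameter standing in for the effect of allele frequencies), and all function names are illustrative.

```python
from math import comb


def binom_upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1.0 - p)**(n - k)
               for k in range(x, n + 1))


def detection_power(n_snps, theta_error, q_somatic, alpha):
    """Probability that a somatic deletion spanning n_snps is called.

    Step 1: find the smallest Type II MI count whose tail probability
            under the genotyping-error model falls below alpha.
    Step 2: return the probability of reaching that count when each SNP
            yields a Type II MI with probability q_somatic (assumed).
    """
    critical = next((x for x in range(n_snps + 1)
                     if binom_upper_tail(x, n_snps, theta_error) < alpha),
                    None)
    if critical is None:
        return 0.0
    return binom_upper_tail(critical, n_snps, q_somatic)


# Invented inputs: the same event is easier to detect with more typed SNPs.
power_sparse = detection_power(10, 4e-4, 0.1, 0.003)
power_dense = detection_power(50, 4e-4, 0.1, 0.003)
```

Under these assumptions, `power_dense` exceeds `power_sparse`, which mirrors the point that low SNP density near CNV regions limits power.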

HapMap Full-Genome Screen. There has been some speculation that the CEU cell lines, which were collected many years ago, may have undergone more somatic rearrangement than the cell lines collected for the other HapMap samples. The total rate of Type II MIs in the Phase I HapMap CEU genotype data, 3.9 x 10^-4, is slightly lower than the rate in YRI, 5.0 x 10^-4. This is in contrast to the rate of Type I MIs, which should be created by both somatic and germline deletions, where the rate in YRI is twice the rate in CEU (1.1 x 10^-3 and 5.3 x 10^-4, respectively).

As a second approach to identifying cell-line deletions, we disregard the location of CNVs detected by the Consortium and simply sift through the Phase I HapMap data using a sliding window approach. The choice of window size involves a trade-off between resolution and power; after running power analyses with different window sizes, we settled on a 100-SNP window (LTA Figure 4). The median size of the 100-SNP window (264 kb) falls in the middle of the range of CNV events detected by the consortium (mean over platforms 249 kb, median 165 kb). The median power is higher in YRI (0.61) than in CEU (0.5).

For each family, we split the genome into non-overlapping 100-SNP windows and calculate the probability that all Type II MIs within that window are due to random error. The resulting p-values from CEU and YRI were ranked separately, and thresholds of significance were determined that control the false discovery rate (FDR) at 0.05 using the method of Benjamini and Hochberg (1995). Based on this analysis, we retained 65 windows with p < from CEU and 34 windows with p < in YRI.
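The two mechanical steps of the genome screen, counting Type II MIs in non-overlapping windows and applying the Benjamini-Hochberg step-up rule to the ranked p-values, can be sketched as below. The function names and the toy inputs are hypothetical, and dropping an incomplete final window is an assumption of this sketch, not something the text specifies.

```python
def window_counts(mi_flags, window_size=100):
    """Count Type II MIs in consecutive non-overlapping windows.

    mi_flags is a per-SNP list of 0/1 indicators in genomic order for one
    family. An incomplete window at the end is dropped (assumption).
    """
    n_full = len(mi_flags) // window_size
    return [sum(mi_flags[i * window_size:(i + 1) * window_size])
            for i in range(n_full)]


def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans marking p-values rejected at FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank k such that p_(k) <= k * q / m.
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


# Toy inputs (invented): a 7-SNP stretch with a window size of 3.
counts = window_counts([1, 0, 0, 1, 1, 0, 0], window_size=3)
calls = benjamini_hochberg(
    [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216])
```

In the screen described above, the per-window p-value would come from the same binomial tail used for the CNV test, and CEU and YRI p-values would be passed to the FDR step separately.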

LTA Figure 4: Power of genome screen. Histograms of estimated power for each 100-SNP window within all CEU and YRI families, estimated using the Phase I HapMap data.

The results of this analysis are presented in LTA Tables 2 and 3. (These results are from a different analysis than the results presented in Supplementary Table 5.) In these tables, we highlight the correspondence between unusual regions identified in the genome-wide screen, preliminary CNV calls removed on the basis of the Phase II data, and genomic regions where no CNVs were detected by either platform. Although we average the expected rate of Type II MIs across the entire window, it is possible that a deletion substantially smaller than the window size can lead to a significant test. By defining the positions of the outermost Type II MIs in a window as the breakpoints of a putative deletion, the median length of CEU somatic events is 78.6 kb, with YRI events slightly larger at 86 kb; the median number of Type II MIs involved in an event was 3 in each population.

Interestingly, 6/65 CEU windows overlap immunoglobulin loci, while 8/65 overlap the del(2q23-q24) artifact detected in NA07055 (described above). Notably, 4/6 of the most significant somatic artifacts called in the preliminary CNV calls are detected in this screen. The two that are not detected are on chromosomes 15 and 19; the center responsible for typing the bulk of Phase I SNPs for these chromosomes filtered Type II MIs from their data prior to release. Presumably, a next-generation genome screen using Phase II data will pick up these unusual features.

Unless otherwise noted, these unusual regions from the genome-wide screen overlap loci of (often considerable) CNV polymorphism detected in the present study. Although we suspect that many of the significant windows in these regions may simply be an artifact of systematic genotyping error in the presence of high-frequency copy number variation (see Caveats section), there remains the possibility that some may represent true somatic events. Other methods will be required to unravel such complexity.
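The breakpoint convention used for the event lengths reported above (the outermost Type II MIs in a significant window) amounts to a minimum and maximum over positions. A small sketch follows, with a hypothetical function name, invented coordinates, and the assumption that length is simply stop minus start:

```python
def putative_deletion_span(mi_positions):
    """Return (start, stop, length_bp) from the outermost Type II MIs.

    mi_positions holds the physical positions (bp) of the Type II MIs in
    one significant window. With fewer than two MIs no span is defined.
    """
    if len(mi_positions) < 2:
        return None
    start, stop = min(mi_positions), max(mi_positions)
    return start, stop, stop - start


# Invented coordinates for three Type II MIs in one window.
span = putative_deletion_span([150_200_000, 150_120_000, 150_180_500])
```

Because the true breakpoints lie somewhere beyond the outermost MIs, a span computed this way is a lower bound on the size of the underlying event.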

LTA Table 2: List of 34 regions of unusual Type II MI clusters in YRI Phase I HapMap data. Expected Type II MI rate is the expected rate of Type II MIs across the region, given a model of random genotype error and conditional on the platform and center used to type the SNPs. The numbers of Type I and Type II MIs are broken down for each cluster. "Removed from prelim": corresponds to a CNV removed from the preliminary CNV calls before downstream analysis; "no CNV call": no CNV was called on either platform in any population at this locus.

Columns in the original table: Child ID, Mother ID, Father ID, Chr, Start, Stop, P-value, Expected Type II MI rate, Type I MI (mother), Type I MI (father), Type II MI, Comments. Only the Child ID, Mother ID, and Comments columns survive in this transcription.

Child ID  Mother ID  Comments
NA19132   NA19131
NA19211   NA19209
NA19173   NA19172
NA19142   NA19140    no CNV call
NA18863   NA18861
NA19142   NA19140
NA19100   NA19099
NA18863   NA18861
NA19205   NA19204    Removed from prelim
NA19205   NA19204    Removed from prelim
NA19240   NA19238    no CNV call
NA18860   NA18858
NA19145   NA19143
NA18863   NA18861
NA19194   NA19193    no CNV call
NA18872   NA18870
NA19103   NA19102    no CNV call
NA19161   NA19159
NA18863   NA18861
NA18863   NA18861
NA18863   NA18861
NA18863   NA18861
NA18854   NA18852
NA19120   NA19116    no CNV call
NA19100   NA19099    no CNV call
NA19205   NA19204    no CNV call
NA18506   NA18508    no CNV call
NA19132   NA19131    no CNV call
NA19103   NA19102
NA19221   NA19222    no CNV call
NA18857   NA18855    no CNV call
NA19173   NA19172    IgH
NA19211   NA19209    no CNV call
NA19154   NA19152    IgL

LTA Table 3: List of 65 regions of unusual Type II MI clusters in CEU Phase I HapMap data. Expected Type II MI rate is the expected rate of Type II MIs across the region, given a model of random genotype error and conditional on the platform and center used to type the SNPs. The numbers of Type I and Type II MIs are broken down for each cluster. "Removed from prelim": corresponds to a CNV removed from the preliminary CNV calls before downstream analysis; "no CNV call": no CNV was called on either platform in any population at this locus.

Columns in the original table: Child ID, Mother ID, Father ID, Chr, Start, Stop, P-value, Expected Type II MI rate, Type I MI (mother), Type I MI (father), Type II MI, Comments. Only the Child ID, Mother ID, and Comments columns survive in this transcription.

Child ID  Mother ID  Comments
NA12753   NA12763
NA07029   NA07000    no CNV call
NA10839   NA12006
NA12801   NA12813
NA10857   NA12044
NA12753   NA12763    no CNV call
NA12707   NA12717
NA12707   NA12717    no CNV call
NA07348   NA07345
NA06991   NA06985
NA07029   NA07000    no CNV call
NA12752   NA12761    no CNV call
NA10839   NA12006
NA10835   NA12249
NA10839   NA12006
NA07048   NA07055    q23.1-q24.3 deletion
NA07048   NA07055    q23.1-q24.3 deletion
NA07048   NA07055    q23.1-q24.3 deletion
NA07048   NA07055    q23.1-q24.3 deletion
NA07048   NA07055    q23.1-q24.3 deletion
NA07048   NA07055    q23.1-q24.3 deletion
NA07048   NA07055    q23.1-q24.3 deletion
NA07048   NA07055    q23.1-q24.3 deletion
NA12753   NA12763
NA12865   NA12875    no CNV call
NA07348   NA07345    Removed from prelim
NA07348   NA07345    Removed from prelim
NA12864   NA12873    no CNV call
NA12865   NA12875
NA12801   NA12813    no CNV call
NA07019   NA07056    no CNV call
NA10846   NA12145
NA12740   NA12751
NA12802   NA12815
NA07048   NA07055    Removed from prelim
NA10854   NA11840
NA10860   NA11993    no CNV call
NA10839   NA12006
NA12740   NA12751
NA12753   NA12763    no CNV call
NA10851   NA12057    no CNV call
NA12752   NA12761    no CNV call
NA07019   NA07056    no CNV call
NA07348   NA07345    Removed from prelim
NA12752   NA12761    no CNV call
NA12740   NA12751    no CNV call
NA12753   NA12763    no CNV call
NA10847   NA12239
NA12801   NA12813    no CNV call
NA10857   NA12044    no CNV call
NA10846   NA12145    no CNV call
NA10846   NA12145    no CNV call
NA12752   NA12761    no CNV call
NA10839   NA12006
NA10830   NA12236    IgH
NA10860   NA11993
NA10855   NA11832    no CNV call
NA12864   NA12873
NA12878   NA12892    no CNV call
NA12864   NA12873    IgL
NA07029   NA07000    IgL
NA12864   NA12873    IgL
NA10860   NA11993    IgL
NA12878   NA12892    IgL
NA12753   NA12763

Proportion of Type II MIs due to somatic deletion. Based on these results, it appears that only a small fraction of any cell-line genome has undergone somatic rearrangement. Although estimating extremely small proportions can be difficult, the abundance of SNP data gave us the confidence to attempt to estimate what fraction of the genome has experienced rearrangement in the typical HapMap individual.

We formulate the problem as a mixture model. In this case, the total number of Type II MIs in each population is due to a mixture of two processes, somatic rearrangement and genotyping error. If we have good estimates of the rates at which Type II MIs occur under each of these processes (θ_1 and θ_2), we may be able to estimate the extent to which each process contributes to our total data. Our data are the counts of Type II MIs within each of N 100-SNP windows across all 30 families, tabulated separately for the CEU and YRI populations using the Phase I data. There were 315, SNP windows in CEU, 306,912 in YRI. The number of Type II MIs within each 100-SNP window is modeled as a binomial random variable k. The likelihood function for the mixture parameter π is

The Type II MI rate due to genotyping error, θ_2, and the Type II MI rate due to somatic deletions, θ_1, are estimated as follows. θ_2 is an 11-element vector, created by tabulating the frequency of Type II MIs for each combination of center and platform. To estimate θ_1, we simulate cell line deletions spanning each 100-SNP window and record the proportion of Type II MIs, averaging across all windows and all families. The object of our inference is the true value of π, the proportion of 100-SNP windows overlapping regions of somatic rearrangement. After a first-pass analysis over the entire range of π, the likelihood function was evaluated over a grid of 100 equally spaced points from 0.9 to 1.0; the results are shown in LTA Figure 5.
It is interesting to note that despite a lower Type II error rate in CEU, the results of this analysis suggest that a greater percentage of CEU genomes are contained in artifacts when compared to YRI genomes.
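A two-component binomial mixture of the kind described in this section can be evaluated on a grid as sketched below. Several things here are assumptions rather than the authors' formulation: the weight π is placed on the somatic component (rate θ_1), following the stated definition of π, although the reported grid of 0.9 to 1.0 suggests the plotted parameter may be the complementary weight; θ_2 is reduced to a single scalar instead of the 11-element center/platform vector; and all names and numbers are invented for illustration.

```python
from math import comb, log


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)


def mixture_loglik(pi, counts, n, theta1, theta2):
    """Log-likelihood of per-window Type II MI counts.

    pi     : assumed weight of the somatic-deletion component (rate theta1)
    1 - pi : weight of the genotyping-error component (rate theta2)
    Both rates must lie strictly between 0 and 1 so the log is defined.
    """
    return sum(log(pi * binom_pmf(k, n, theta1)
                   + (1.0 - pi) * binom_pmf(k, n, theta2))
               for k in counts)


def grid_mle(counts, n, theta1, theta2, grid):
    """Return the grid value of pi with the highest log-likelihood."""
    return max(grid,
               key=lambda pi: mixture_loglik(pi, counts, n, theta1, theta2))


# Invented data: 90 windows with no Type II MIs and 10 windows with 30.
grid = [i / 100 for i in range(101)]
pi_hat = grid_mle([0] * 90 + [30] * 10, 100, 0.3, 0.001, grid)
```

With these toy counts the maximum falls at about 0.10, matching the fraction of windows that look like the high-rate component.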

LTA Figure 5: Mixture model analysis. Plot of log L(π) evaluated over a grid of values; the maximum likelihood estimate for π is indicated with a vertical line for CEU ( , red) and YRI ( , blue). The scale of the y-axis shows the value of the likelihood function evaluated for CEU data; YRI values shown are L+10,000 for display purposes.

Caveats. The results of this work and previous work suggest that large somatic deletions in the trio parents of the HapMap data should be easy to detect using patterns of SNP failures, when the deletion occurs in a region of the genome that harbors little or no population copy number variation. It will be a much more difficult problem to detect cell line artifacts at regions where there is already substantial germline copy number variation. Unfortunately, such regions may be the most inclined to experience somatic rearrangement if CNV frequency is related to the underlying mutation rate (Lam and Jeffreys 2006). At high deletion frequencies (perhaps > 40%), SNP genotyping algorithms sometimes cluster CNV status instead of SNP genotype status. During this analysis, it became clear that Type II MIs were sometimes created when two hemizygous parents (called as SNP homozygotes) give rise to a homozygous null child (erroneously called a SNP heterozygote).

As mentioned in the introduction, there are other phenomena that could create Type II MIs that are not captured in our simple model. A homozygous tract of > 100 Mb on 1q in a CEU individual was first thought to be a large cell line deletion by Conrad et al. (2006), but subsequent analysis revealed it to be a case of uniparental isodisomy. Such events could occur by mitotic recombination, and would produce a pattern of SNP failures identical to that of a somatic deletion. In cases where large somatic deletions are predicted on the basis of SNP data but are not detected in the underlying intensity data, uniparental disomy (UPD) is one possible explanation. Another possible explanation for discordant results between the SNP-based method and the array-based methods used in the current paper is the use of different lots of cells at different points in time. Cell lines that are mosaic for artifacts should also behave unpredictably when typed on various platforms at various points in time. Finally, several sample mix-ups were identified from unusual clusters of MIs in an earlier analysis of the Phase I data (Conrad et al. 2006). Unresolved sample mix-ups could continue to contribute to the signal we are detecting here.

References

Benjamini, Y., and Y. Hochberg. 1995. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society 57:

Conrad, D.F., T.D. Andrews, N.P. Carter, M.E. Hurles, and J.K. Pritchard. 2006. A high-resolution survey of deletion polymorphism in the human genome. Nat Genet 38:

International HapMap Consortium. 2005. A haplotype map of the human genome. Nature 437:

Lam, K.W., and A.J. Jeffreys. 2006. Processes of copy-number change in human DNA: the dynamics of α-globin gene deletion. Proc Natl Acad Sci U S A 103:

McCarroll, S.A., T.N. Hadnott, G.H. Perry, P.C. Sabeti, M.C. Zody, J.C. Barrett, S. Dallaire, S.B. Gabriel, C. Lee, M.J. Daly, and D.M. Altshuler. 2006. Common deletion polymorphisms in the human genome. Nat Genet 38:


More information

Computational Systems Biology: Biology X

Computational Systems Biology: Biology X Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#4:(October-0-4-2010) Cancer and Signals 1 2 1 2 Evidence in Favor Somatic mutations, Aneuploidy, Copy-number changes and LOH

More information

Introduction to genetic variation. He Zhang Bioinformatics Core Facility 6/22/2016

Introduction to genetic variation. He Zhang Bioinformatics Core Facility 6/22/2016 Introduction to genetic variation He Zhang Bioinformatics Core Facility 6/22/2016 Outline Basic concepts of genetic variation Genetic variation in human populations Variation and genetic disorders Databases

More information

Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing

Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing www.sciencemag.org/cgi/content/full/science.1186802/dc1 Supporting Online Material for Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing Jared C. Roach, Gustavo Glusman, Arian

More information

Nature Biotechnology: doi: /nbt.1904

Nature Biotechnology: doi: /nbt.1904 Supplementary Information Comparison between assembly-based SV calls and array CGH results Genome-wide array assessment of copy number changes, such as array comparative genomic hybridization (acgh), is

More information

Using GWAS Data to Identify Copy Number Variants Contributing to Common Complex Diseases

Using GWAS Data to Identify Copy Number Variants Contributing to Common Complex Diseases arxiv:1010.5040v1 [stat.me] 25 Oct 2010 Statistical Science 2009, Vol. 24, No. 4, 530 546 DOI: 10.1214/09-STS304 c Institute of Mathematical Statistics, 2009 Using GWAS Data to Identify Copy Number Variants

More information

A. Incorrect! Cells contain the units of genetic they are not the unit of heredity.

A. Incorrect! Cells contain the units of genetic they are not the unit of heredity. MCAT Biology Problem Drill PS07: Mendelian Genetics Question No. 1 of 10 Question 1. The smallest unit of heredity is. Question #01 (A) Cell (B) Gene (C) Chromosome (D) Allele Cells contain the units of

More information
