Systematic Analysis for Identification of Genes Impacting Cancers
Arpita Singhal
Stanford University
Saint Francis High School

ABSTRACT

Currently, vast amounts of molecular information involving genomic characterizations exist for various types of cancers. However, the integration of the various forms of biological data, necessary for a better understanding of the key processes underlying cancer, remains challenging. This project uses microarray-based comparative genomic hybridization (aCGH) data to study genomic alterations in various tumor samples, using statistical procedures in R. To find the hidden copy number states of each chromosome and thereby characterize genomic alterations, this project applies Hidden Markov Models to datasets from cancer patients. The efficacy of the homogeneous and heterogeneous Hidden Markov Models is evaluated against the known truth of simulated data by looking at the true positive rates and false discovery rates for breakpoint detection. This project mainly determines the number and types of copy number variations present in the chromosomes of the tumor datasets, obtained from The Cancer Genome Atlas portal. Recurrent chromosomal aberrations at particular genome locations may indicate the presence of tumor suppressor genes or oncogenes. After the chromosomes with high copy number changes are recognized, the genes located in these regions of high copy number variation are identified. The association between chromosomal location and cancer phenotype provides a more reliable and informative cancer genome characterization that can lead to useful insights into cancer biology for further disease classification, prognosis, and personalized treatment.

INTRODUCTION

A central issue in cancer biology is the identification of specific chromosomal regions that are involved in cancer progression and other biological processes.
Unbalanced chromosomal abnormalities that result in gains and losses of chromosomal segments often cause human genetic disorders, including cancer. Driven by an accumulation of genetic and epigenetic changes, tumors exhibit altered levels of gene expression and the disruption of normal cell growth and survival. A variety of cancers exhibit gains in proto-oncogenes and losses in tumor suppressor genes; thus, growth-limiting functions and self-repair processes of cancerous regions are often seriously harmed. The genomic alterations observed in tumors reflect underlying failures in the maintenance of genetic stability.

Copy Number Variations (CNVs)

Copy Number Variations collectively describe deletions, insertions, duplications, and other complex variants present in the human genome. Redon et al. (2006) defined a CNV as a DNA segment of one kilobase or larger that is present at a variable copy number in comparison to a reference genome. A CNV can be simple in structure, such as a duplication, or it may involve complex gains or losses of homologous sequences at multiple sites in the genome. Chromosomal copy numbers are defined to be 2 for normal cells, 1 or 0 for single and double deletions, and 3 or higher for single copy gains or higher level amplifications. Figure 1 shows the various forms of chromosome changes. Cancer progression is usually a result of copy number variations, which may represent the over-expression of proto-oncogenes or down-regulation of tumor suppressor genes in cancer genomes. Structural variations, such as CNVs, influence the expression of different phenotypic traits and are found to impact various diseases and affect the development of tumors. DNA copy number variations are used in cancer research to search for novel genes involved in cancers through the analysis of genes located in specific regions. It is therefore of considerable importance to identify as precisely as possible the chromosomal regions with abnormal copy numbers.
Array CGH

Through the use of microarray-based comparative genomic hybridization, the regions of genes with altered copy numbers can be identified. This technique characterizes the relationship between target sequences on an unknown test genome and a reference genome. Array CGH has been developed to identify CNVs within cancerous regions. As an indispensable tool for understanding disease mechanisms, aCGH detects and maps changes in the copy number of DNA sequences and can be used to analyze tumor genomes and chromosomal aberrations. The log-ratio values obtained from the aCGH data are used as the emissions in the Hidden Markov Model in order to find the hidden states representing the copy numbers of the chromosomes.

This technique uses a test DNA sample, such as tumor genomic DNA, and a reference DNA sample, such as normal genomic DNA, each labeled with a different fluorescent dye. The DNA samples are then combined with unlabeled Cot-1 DNA, a reagent used to block repetitive DNA sequences and prevent nonspecific hybridization. The two samples are hybridized together onto a microarray, and a microarray scanner is used to measure the fluorescent signals and capture digital images. The fluorescence intensity signals from labeled DNA, hybridized on target probes, are processed and normalized. The difference between the intensity signals of each probe from the test and reference genomes is expressed as a log ratio and can be analyzed to detect genomic alterations and aberrations. In the ideal case, the log ratio is equal to 0, demonstrating that no copy number change has occurred in that region of the genome; a higher or lower log ratio implies a change in copy number. The calculation of the log ratios determines the copy number variation. The log ratio changes only with the test copy number, because the reference copy number stays constant at 2, representing the normal diploid state of the reference sample.
When the tumor sample has no copy of the particular region identified on the chromosome, the value log2(0/2) tends to negative infinity, indicating that the region of the chromosome has experienced a homozygous deletion. The value log2(1/2) is observed when the copy number is equal to 1; since log2(1/2) is equal to -1, a heterozygous deletion has occurred. When the tumor copy number is equal to 3, the log2 ratio of (3/2) is calculated and results in 0.585, implying that a heterozygous duplication has taken place. Lastly, when the tumor copy number is equal to 4, the log2 ratio of (4/2) is calculated and results in 1, implying that a homozygous duplication has occurred.

The array CGH data is further analyzed with appropriate statistical methods. A log ratio greater than 0 represents a higher number of target sequences in the test genome when compared to the reference genome; conversely, a log ratio less than 0 indicates a lower number of target sequences in the test genome. However, the complexity of eukaryotic genomes often causes the total signal of a microarray hybridization to be diluted, which makes aCGH data noisy and unsuitable for directly reading off the accurate copy number of a region. Thus, methods that can accurately use aCGH data must be implemented. The analysis of aCGH data can help determine the location of DNA copy number aberrations within the tumor genome for improved cancer diagnosis, drug development, and molecular therapy. A representation of microarray-based comparative genomic hybridization is shown in Figure 2. With more array CGH data sets emerging, more efficient algorithms that detect regions of gains and losses are necessary to provide an accurate estimate of error for the detection.
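The idealized log ratios quoted above can be reproduced with a few lines of Python. This is only a worked illustration under the stated assumption of a diploid reference (copy number 2) and noise-free data; the function and label names are hypothetical and not part of any package used in this project.

```python
import math

REFERENCE_COPIES = 2  # assumption from the text: normal diploid reference


def expected_log2_ratio(tumor_copies, reference_copies=REFERENCE_COPIES):
    """Idealized, noise-free log2 ratio for a given tumor copy number."""
    if tumor_copies == 0:
        # log2(0/2) diverges to negative infinity (homozygous deletion)
        return float("-inf")
    return math.log2(tumor_copies / reference_copies)


def describe(tumor_copies):
    """Label a copy number using the terms of the surrounding text."""
    labels = {
        0: "homozygous deletion",
        1: "heterozygous deletion",
        2: "no change",
        3: "heterozygous duplication",
        4: "homozygous duplication",
    }
    return labels.get(tumor_copies, "higher-level amplification")


if __name__ == "__main__":
    for copies in range(5):
        ratio = expected_log2_ratio(copies)
        print(f"copies={copies}  log2 ratio={ratio:.3f}  {describe(copies)}")
```

Real aCGH log ratios are noisy and compressed toward zero, so measured values rarely sit exactly on these idealized levels; that gap is what motivates the statistical modeling described later.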
The research conducted for this study uses an algorithm to categorize the chromosomes based on the types of copy number aberrations to accurately identify genes relevant to tumor progression. The objectives of this project are to (1) analyze cancer genomic data in order to predict the hidden copy number states for each chromosomal region and (2) use the hidden copy number states of each region to accurately identify proto-oncogenes and tumor suppressor genes.
General approach used in this project

The approach used in this project can be divided into the following five steps, which are discussed in detail later.

1. Upload data from Data Portal
2. Normalization of Data
3. Segmentation of data
4. Applying Hidden Markov model
5. Results
   a. Comparing the Efficacy of the Hidden Markov Models for True Positive Rate (TPR) and False Discovery Rate (FDR)
   b. Detection of Genes through Analysis of Gains and Losses

Previous Approaches for Array CGH Data Analysis

With more array CGH data sets emerging, more efficient algorithms that detect regions of gains or losses and provide an accurate estimate of error for the detection are necessary. Previously, researchers have devised means for analyzing array CGH data sets. Wang et al. (2004) used the method of Clustering Along Chromosomes to detect the signal regions by depicting the spatial structure within genomic alterations. Olshen et al. (2004) utilized circular binary segmentation to segment a chromosome into contiguous regions, modeling the data with the help of a permutation reference distribution. However, these methods do not take into account the various biological covariates, including the distance between clones, that impact segmentation of the array CGH data. The research conducted for this study uses an algorithm to categorize the chromosomes based on the types of copy number aberrations to accurately identify genes relevant to tumor progression.

The Cancer Genome Atlas (TCGA)

The array CGH data used for this project was obtained from the TCGA data portal. This platform allows access to data sets, and it provides various types of data, including clinical information, genomic characterization data, and high-level sequence analysis of the tumor genomes.

METHODS

Data

The data used for this project was obtained from the TCGA platform.
GBM Level 1 array CGH data, from the Agilent Human Genome CGH Microarray 244A platform processed at the Harvard Medical School Center, was downloaded from the TCGA data portal. Level 1 data represents raw signals per probe for each participant's tumor sample. All data sets were processed using the Bioconductor project (Gentleman, 2004) and its R packages limma (Smyth and Speed, 2003) and snapCGH (Smith, 2009).

Data Normalization during Pre-Processing

Raw array CGH data is often affected by experimental and biological factors that make it difficult to identify the true copy number for a genomic clone. Biological factors include the purity and ploidy of a sample. To correct for these factors, background correction and normalization techniques were performed on each array. With normalization, the ploidy of the reference sample no longer played a role. The arrays were normalized using the normalizeWithinArrays() function within the limma package. This function normalized the expression log ratios for two-color spotted microarray experiments, so that the log ratios averaged to zero within each array. The backgroundCorrect() function, also within the limma package, was used to correct the background of the microarray expression intensities by subtracting the average signal intensity of the area between spots.
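The effect of these two pre-processing steps can be illustrated with a simplified Python sketch. It is not limma's implementation: it assumes plain background subtraction followed by median centering of the log ratios, and the intensities and function names are hypothetical values chosen for illustration.

```python
import math
import statistics


def background_subtract(foreground, background, floor=1.0):
    """Subtract the local background from each spot intensity.

    The floor (an assumption of this sketch) keeps corrected values
    positive so that a logarithm can be taken afterwards.
    """
    return [max(f - b, floor) for f, b in zip(foreground, background)]


def centered_log_ratios(test, reference):
    """Per-probe log2(test / reference), shifted so the array median is 0."""
    raw = [math.log2(t / r) for t, r in zip(test, reference)]
    center = statistics.median(raw)
    return [value - center for value in raw]


# Hypothetical raw intensities for five probes on one array. The test
# channel is systematically about twice as bright (a dye or labeling bias).
test_fg = [1100.0, 2100.0, 600.0, 1100.0, 4100.0]
test_bg = [100.0] * 5
ref_fg = [550.0, 1050.0, 550.0, 550.0, 1050.0]
ref_bg = [50.0] * 5

test_corrected = background_subtract(test_fg, test_bg)
ref_corrected = background_subtract(ref_fg, ref_bg)
normalized = centered_log_ratios(test_corrected, ref_corrected)

if __name__ == "__main__":
    print(normalized)
```

In this toy array the global two-fold brightness difference is removed by the centering step, so only the probes that deviate from the bulk of the array keep a non-zero log ratio.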
Segmentation of Data

Each array CGH dataset was processed using the processCGH() function from the snapCGH package. This function used the normalized MAList, which contained the log expression ratios and was created by the normalization and background correction. It then ordered and filtered the clones based on their mapping information. Thus, the datasets were prepared for segmentation. Using segmentation models, specific segments were identified and the segment variance of log ratio values was minimized.

Hidden Markov Models (HMMs)

Hidden Markov Models are a formal foundation for making probabilistic models of sequence labeling problems (Eddy, 2004). An HMM consists of a finite set of states, with each state having an emission probability distribution and specific transition probabilities to the other states. At each state, a residue is produced from the state's emission probability distribution. Then, the next state is chosen based on the state's transition probability distribution. The model thus generates two sets of information: the underlying state path, which is created while transitioning from state to state and is hidden, and the observed sequence, which consists of the residues emitted from each state in the state path. Because HMMs can effectively uncover the relationship between the underlying states and the observed emissions, they are useful in analyzing array CGH data. The log ratios obtained from the array CGH data are the emissions, and the underlying states are the copy number values of each region on the chromosome, which are related to the emissions through specific probabilities.

Two types of HMMs exist to identify the underlying states of the array CGH data, representing the copy number aberrations: the homogeneous model and the heterogeneous model, each of which has its own distinct advantages. The former option, the homogeneous HMM, estimates the number of hidden states via model selection and performs an analysis for each chromosome.
The homogeneous HMM regards the underlying states as segments of a common mean that represent the copy number values of each region. It assumes that the transition probability matrix is the same at each step along the chromosome and thus does not consider the distance between clones. To fit the unsupervised homogeneous HMM for each dataset, methods in the Bioconductor package snapCGH were used; the function runHomHMM() was used to discover the hidden copy numbers for each chromosome from the patient datasets for GBM.

On the other hand, the heterogeneous HMM utilizes transition probabilities that are dependent on the distance between clones; specifically, the probability of remaining in the same hidden state is a decreasing function of the distance between one probe and the probe before it. When the distance between two clones is very large, the state of a probe is no longer affected by the state of the previous clone. The function runBioHMM() was used for the heterogeneous HMM. This project uses both the homogeneous and heterogeneous HMMs and identifies which one assesses the copy number variations more accurately, using simulated data and the corresponding True Positive Rates and False Discovery Rates.

RESULTS

First, the efficacy of the homogeneous HMM was compared to that of the heterogeneous HMM. A three-step algorithm was then used to identify the altered chromosomal regions in the cancer data. The three steps of the algorithm consist of the data pre-processing and segmentation of data, the identification of the hidden copy number states of the cancer data using the HMMs, and the quantification of the specific gains or losses to detect the genes in the regions of interest. The three-step algorithm is applied to array CGH GBM datasets for five different patients.
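The contrast between the homogeneous and heterogeneous transition models described in the Methods can be illustrated with a small Python sketch. It is not the snapCGH implementation: the exponential form of the stay probability, the Gaussian emissions with fixed means, and all parameter values are assumptions made for illustration, and the most likely state path is found with the standard Viterbi algorithm.

```python
import math


def stay_probability(distance_kb, n_states, max_stay=0.99, scale_kb=1000.0):
    """Probability of remaining in the same state (assumed functional form).

    It decreases with the distance to the previous clone and approaches
    1 / n_states, at which point the previous state carries no information.
    """
    uniform = 1.0 / n_states
    return uniform + (max_stay - uniform) * math.exp(-distance_kb / scale_kb)


def log_gauss(x, mean, sd):
    """Log density of a normal distribution, used as the emission model."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))


def viterbi(log_ratios, positions_kb, means, sd=0.2, heterogeneous=True):
    """Most likely hidden state path for one chromosome.

    heterogeneous=False uses the same transition matrix at every step;
    heterogeneous=True lets it depend on the gap between adjacent clones.
    """
    k = len(means)
    score = [math.log(1.0 / k) + log_gauss(log_ratios[0], m, sd) for m in means]
    back = []
    for t in range(1, len(log_ratios)):
        gap = positions_kb[t] - positions_kb[t - 1] if heterogeneous else 0.0
        stay = stay_probability(gap, k)
        move = (1.0 - stay) / (k - 1)
        new_score, pointers = [], []
        for j, mean in enumerate(means):
            candidates = [
                score[i] + math.log(stay if i == j else move) for i in range(k)
            ]
            best = max(range(k), key=candidates.__getitem__)
            pointers.append(best)
            new_score.append(candidates[best] + log_gauss(log_ratios[t], mean, sd))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(k), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical emission means for three states: loss, neutral, single gain.
STATE_MEANS = [-1.0, 0.0, 0.585]
ratios = [0.05, -0.02, 0.60, 0.55, 0.62, -0.90, -1.10, 0.01]
positions = [0, 100, 200, 300, 400, 500, 600, 700]  # clone positions in kb

if __name__ == "__main__":
    print("heterogeneous:", viterbi(ratios, positions, STATE_MEANS))
    print("homogeneous:  ", viterbi(ratios, positions, STATE_MEANS, heterogeneous=False))
```

On evenly spaced clones such as these, the two variants are expected to agree; they can differ when large gaps separate adjacent clones, because the heterogeneous model then makes a state change across the gap less costly.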
True Positive Rate and False Discovery Rate

The efficacy of the homogeneous and heterogeneous HMMs is evaluated against the known truth of the simulated data by looking at the true positive and false discovery rates for breakpoint detection, as seen in Figure 3. The data was simulated using the simulateData() method in the snapCGH package. This function simulates aCGH data and was used to create 10 arrays to account for variation in copy number data. The compareSegmentations() method was used to create a matrix consisting of the true positive rates and the false discovery rates for each HMM; this function evaluates the performance of a segmentation method against the known truth of the simulated data. The boxplot() function was used to generate a plot of the rates to compare the two HMMs. The true positive rates and the false discovery rates of both the homogeneous and heterogeneous HMMs demonstrated that the heterogeneous HMM was more successful in identifying the copy number values accurately.

Normalization, Background Correction, and Segmentation

The first step, the data pre-processing, helped eliminate background errors within the data using normalization and background correction methods. These methods made the subsequent steps less prone to error. Segmentation was carried out using the snapCGH package in R, which first splits each dataset into various segments based on the variation of copy number. Then the unsupervised HMM was used to find the copy number states of each chromosome. After the segmentation, the smoothed log ratios for each patient's data were plotted, as shown in Figure 4. Each panel represents the dataset from a different GBM patient and shows the log ratios of that patient plotted against genomic position in kilobases. The different colors represent the twenty-four distinct chromosomes in the human genome (22 autosomes, X, and Y). These log ratios were used as the emissions in the HMM, necessary for determining the copy number states of each patient.

Use of the Hidden Markov Models

Both the homogeneous HMM and heterogeneous HMM were used to identify the copy number states of each chromosome for every patient; however, only results from the heterogeneous HMM are shown because of its higher efficacy. Figure 5 displays the plots of the hidden states that were found for each chromosome of each patient.
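The true positive and false discovery rates for breakpoint detection used in the comparison of the two HMMs can be sketched in Python as follows. The definitions here are assumptions made for illustration (a breakpoint is any position where the state changes, and matches are counted within an optional tolerance); they are not necessarily the exact definitions used by the snapCGH comparison function, and the state sequences are hypothetical.

```python
def breakpoints(states):
    """Indices at which the copy number state differs from the previous clone."""
    return {i for i in range(1, len(states)) if states[i] != states[i - 1]}


def breakpoint_rates(true_states, called_states, tolerance=0):
    """Return (true positive rate, false discovery rate) for breakpoints.

    TPR = true breakpoints that were detected / all true breakpoints
    FDR = called breakpoints with no true match / all called breakpoints
    A call matches a true breakpoint if it lies within `tolerance` clones.
    """
    truth = breakpoints(true_states)
    called = breakpoints(called_states)

    def near(position, others):
        return any(abs(position - other) <= tolerance for other in others)

    detected = sum(1 for b in truth if near(b, called))
    false_calls = sum(1 for b in called if not near(b, truth))
    tpr = detected / len(truth) if truth else 0.0
    fdr = false_calls / len(called) if called else 0.0
    return tpr, fdr


# Hypothetical simulated truth and one segmentation result for ten clones.
truth = [2, 2, 2, 3, 3, 3, 2, 2, 1, 1]
called = [2, 2, 2, 3, 3, 3, 3, 2, 1, 1]

if __name__ == "__main__":
    print(breakpoint_rates(truth, called))
    print(breakpoint_rates(truth, called, tolerance=1))
```

Repeating such a calculation over each simulated array and each model would give the kind of per-array rates that a boxplot can summarize.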
The plot for Patient 1 shows copy number gains (up-regulation) on autosomes 5 and 14 and on sex chromosome Y, which is shown as chromosome 24; copy number losses (down-regulation) are observed on chromosomes 4 and 21. The plot of the states for Patient 2 demonstrates gains on chromosomes 2, 4, 5, 7, 8, 9, 12, 14, 20, 21, and 23; losses are seen on autosomes 1, 16, and 18 and on sex chromosome Y. The states of Patient 3 show few copy number changes: autosomes 10 and 15 and sex chromosome Y have an increased copy number, and chromosome 1 has a decreased copy number. On the other hand, the states of Patient 4 show greater copy number variance. Autosomes 3, 12, 14, 15, 16, 17, and 22 and sex chromosomes X and Y all demonstrate a greater copy number, and this patient has no losses. Lastly, Patient 5 also has several copy number gains, on autosomes 2, 6, 7, 9, 12, 13, 14, 15, 20, and 22 and on sex chromosome Y.

While it is important to consider that each individual's genome contains several mutations and some copy number variations, whole-chromosome aberrations are quite often indications of disease. The identification of the copy number states of each chromosome in the genomes of cancer patients is useful for identifying common chromosomes that may impact the progression of the Glioblastoma Multiforme tumor. If an alteration is observed in several tumors, genes can be identified as candidate oncogenes or tumor suppressor genes through the analysis of the specific chromosomal position. In addition, the individual variance in copy number of each chromosome for each patient allows for personalized treatment.

Identification of Specific Gains or Losses

The third and final step was conducted by comparing the log ratio plots of the five patient samples, as seen in Figure 4, identifying the common regions with similar gains or losses, and mapping those regions to specific genes.
While some datasets displayed a more drastic change in the log ratios as compared to the other datasets, a majority of the datasets exhibited an elevated copy number at chromosomes 12 and Y and a decreased copy number at chromosome 1. The chromosome numbers are identified through the heterogeneous HMM analysis on single chromosomes. The variance in copy number among the different patients can be attributed to the diversity of genomic data from individual to individual. While the patients' genomes may share common gains and losses, there are several external conditions that influence the expression of regions of the genome, including the patient's age and medical history. The gains or losses of certain chromosomal regions were identified using the plots of the copy numbers that rely on the log ratios.

DISCUSSION

This project aims to design an algorithm that can identify the copy number states for each chromosome. The method yields copy number states that can be analyzed for recurrent gains and losses. This project applies the methods to Glioblastoma multiforme array CGH data to determine the copy number states for each chromosome. It also matches the corresponding copy number gain or loss to a region of interest that may be involved in the progression of the tumor. The results from this project can be used for improved and personalized treatment by identifying genes that are up-regulated or under-expressed. Each data set, obtained from a different patient affected by the same disease, has some differing log ratios and copy numbers. The variance in copy number among patients is due to factors, including environmental and hereditary ones, that impact the log ratios and, thus, the copy number variations. For further research, patient medical history, age, and other medical factors can be included in the study in order to more accurately study chromosomal aberrations that are involved in GBM.

Some similar regions of interest were identified amongst the GBM patients. Most of the datasets contain duplications at chromosome 12. Using the GeneName data, the names of the genes located in the regions of elevated copy number were found. Chromosome 12 contains the genes PDE3A and ST8SIA1. PDE3A, or Phosphodiesterase 3A, plays a critical role in many cellular processes by regulating the amplitude and duration of intracellular cyclic nucleotide signals. ST8SIA1, or ST8 Alpha-N-Acetyl-Neuraminide Alpha-2,8-Sialyltransferase 1, is important for cell adhesion and growth of malignant cells.
The dysregulation of these genes may contribute to the progression of cancer, as these genes are important in maintaining cell processes and seem to affect the growth of malignant cells. In chromosome 1, genes AMY2A and KIFAP2 were under-expressed; this decrease in expression may have caused the cells to stop functioning normally and thus encouraged tumor growth. Additionally, heterozygous and homozygous duplications were seen on chromosome 19, which contains the genes MLL4 and PSENEN. MLL4, or Myeloid/lymphoid or mixed-lineage leukemia 4, is most commonly seen in leukemia; however, it is often amplified in tumor cell lines and may be involved in the formation of the GBM tumor. Also, some patients had an increased copy number at chromosome 7, which represents the amplification of the Epidermal Growth Factor Receptor (EGFR) gene that causes cells to grow and divide. EGFR is a highly prominent oncogene present in various types of cancer, including GBM and lung cancers. In addition to the genes identified across all samples, genes specific to certain patients can be used for more personalized treatment.

CONCLUSION

This project has successfully utilized array CGH data to discover various genes that may impact the formation and progression of the GBM tumor in patients. The copy number phenotype discovered for each cancer patient is associated with a known biological marker that may be associated with the progression of the cancer, either by its over-expression or under-expression. If the gene is over-expressed, it is most likely an oncogene that causes cells to grow and divide, as observed in cancers. When the gene is under-expressed, the gene may be a cause of the tumor development because it is probably an important cell cycle gene that suppresses the formation of tumors in cells. The resulting copy number phenotype, determined with the HMM used in this project, is associated with biological markers that may be previously unassociated with the cancer phenotype.
This association will help provide a more reliable and informative genome characterization of cancer and support the development of more specialized disease classification, prognosis, and personalized treatment for the cancer patient. Since this algorithm has been used on Level 1 data, this project has demonstrated the analysis of raw data by normalization, segmentation, and implementation of the HMMs to identify cancer biomarkers for the development of a better and more personalized form of treatment for patients affected with GBM. For further research, the algorithm used in this project can be applied to more GBM datasets to more reliably find the biological markers that may cause the formation of the brain tumor within the cancer patients. Additionally, the algorithm used in this project can be applied to other cancer types for a similar analysis of cancer biomarkers. When this algorithm is applied, other medical factors can be taken into account to eliminate interference in the study of the copy number variations. Further research can be conducted that will standardize the data to incorporate factors including the age and previous medical conditions of the patient.

ACKNOWLEDGEMENTS

I am grateful to Professor Susan Holmes from the Statistics Department at Stanford University for her valuable time, help, and guidance provided while I was conducting this project and taking the BioStatistics course; Professor Trevor Martin for his help during the BioStatistics course; and Julia Fukuyama for her advice on how to approach certain issues while using R. Also, my Physics-Honors teacher, Mrs. Segal, provided me with valuable advice while I was conducting my project. In addition, I am very grateful to Dr. Sean Davis, Staff Scientist at the Center for Cancer Research at the National Cancer Institute, for his valuable time and feedback provided while I was conducting this project. Also, I am thankful to my parents for their continuous support.

ANNOTATED BIBLIOGRAPHY

Eddy, Sean R. What Is a Hidden Markov Model. Nature.com. Nature Publishing Group, Web. 5 Oct

This research article discusses the definition of a Hidden Markov Model. The author defines a Hidden Markov Model as a formal foundation for making probabilistic models of sequences by considering transition probabilities. His definition captures the significance of this project, which uses Hidden Markov Models to find the underlying states from the given emissions.
Additionally, the author uses examples based on genetic sequences. Through these examples, he notes that the sequence, in terms of A, C, T, and G, represents the observed emissions, and the underlying state path is hidden and must be discovered through the use of the Hidden Markov Model, which contains transition probabilities. The author of this research article presents his research in a highly credible fashion since he first defines the Hidden Markov Model and then provides examples supporting his definition. In addition, he makes use of several sources from credible authors; for example, he cites Rabiner, who wrote a tutorial on Hidden Markov Models. Dr. Sean R. Eddy works at Howard Hughes Medical Institute and the Department of Genetics at Washington University School of Medicine. He has authored research papers that have used Hidden Markov Models. Thus, he is a credible source as he has the knowledge necessary for defining and demonstrating what a Hidden Markov Model is.

Olshen, A.B., E. S. Venkatraman, Robert Lucito, and Michael Wigler. Circular binary segmentation for the analysis of array based DNA copy number data. Biostat (2004) 5 (4): , doi: /biostatistics/kxh008.

The research paper, Circular binary segmentation for the analysis of array based DNA copy number data, discusses another approach for analyzing array CGH data. The authors utilized array CGH data and the circular binary segmentation method to translate noisy intensity measurements into regions of equal copy number. They applied this method to test breast cancer data, as well as to simulated data with known copy number alterations, to test the efficacy of their new method. They thereby developed another method for analyzing array CGH data to detect regions of gains and losses based on the segments found with their method.
The authors of this research paper present the research in a highly efficient and credible way, as they demonstrate a new development and apply it to simulated data and test data. Their method is one approach for analyzing array CGH data to obtain the over-expressed and down-regulated regions. Dr. Venkatraman is from the Department of Epidemiology and Biostatistics at the Memorial Sloan-Kettering Cancer Center; his position gives him the credibility for conducting this research. The other two authors, Robert Lucito and Michael Wigler, also have significant experience in the cancer field as they conduct cancer research at the Cold Spring Harbor Laboratory in New York.

Wang, P., Y. Kim, J. Pollack, B. Narasimhan, and R. Tibshirani. A Method for Calling Gains and Losses in Array CGH Data. Biostatistics 6.1 (2004): Web.

This research paper focuses on the development of a new method for detecting gains and losses in array CGH data. The authors utilize clustering to identify crucial regions. They have developed a new algorithm, Clustering along Chromosomes (CLAC), to detect specific regions. CLAC builds hierarchical clustering-style trees along each chromosome arm or chromosome and then selects the interesting clusters by controlling the False Discovery Rate. They have applied the method to a lung cancer microarray CGH data set. Their clustering algorithm is iterative: it starts from clusters containing one gene each, repeatedly merges two adjacent clusters, and continues until one big cluster is formed. The authors of this research paper all work in different departments at Stanford University and thus represent an interdisciplinary approach to this paper. The main author, Dr. Wang, works in the Statistics Department and thus is extremely knowledgeable in this field.
Their research provides a valuable insight into another way of analyzing array CGH data, and underscores the necessity of analyzing array CGH data to find the regions that have demonstrated gains or losses for better disease treatment in the future.

WORKS CITED

Albertson, D.G. and Daniel Pinkel. Genomic microarrays in Human Genetic Disease and cancer. Hum. Mol. Genet. (2003) 12 (suppl 2): R145-R152, August 5, 2003, doi: /hmg/ddg261

Eddy, Sean R. What Is a Hidden Markov Model. Nature.com. Nature Publishing Group, Web. 5 Oct

Gentleman, R.C., Vincent J. Carey, Douglas M. Bates, Ben Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J. Rossini, Gunther Sawitzki, Colin Smith, Gordon Smyth, Luke Tierney, Jean Y. H. Yang, and Jianhua Zhang. Bioconductor: Open software development for computational biology and bioinformatics. Genome Biology, 5:R80,

Marioni, J.C., N.P. Thorne, S. Tavare, F. Radyanyi. BioHMM: A heterogeneous Hidden Markov Model for Segmenting array CGH data. Bioinformatics. 2006; 22:

Olshen, A.B., E. S. Venkatraman, Robert Lucito, and Michael Wigler. Circular binary segmentation for the analysis of array based DNA copy number data. Biostat (2004) 5 (4): , doi: /biostatistics/kxh008.

Rabiner, L.R. A Tutorial on Hidden Markov Model and Selected Applications in Speech Recognition. Proceedings of the IEEE, Volume 77, February 1989,

Smith, M.L., John C. Marioni, Steven McKinney, Thomas Hardcastle and Natalie P. Thorne (2009). snapCGH: Segmentation, normalisation and processing of aCGH data. R package version

Redon, Richard, Shumpei Ishikawa, Karen R. Fitch, Lars Feuk, George H. Perry, T. Daniel Andrews, Heike Fiegler, Michael H. Shapero, Andrew R. Carson, Wenwei Chen, Eun Kyung Cho, Stephanie Dallaire, Jennifer L. Freeman, Juan R. González, Mònica Gratacòs, Jing Huang, Dimitrios Kalaitzopoulos, Daisuke Komura, Jeffrey R. Macdonald, Christian R. Marshall, Rui Mei, Lyndal Montgomery, Kunihiro Nishimura, Kohji Okamura, Fan Shen, Martin J. Somerville, Joelle Tchinda, Armand Valsesia, Cara Woodwark, Fengtang Yang, Junjun Zhang, Tatiana Zerjal, Jane Zhang, Lluis Armengol, Donald F. Conrad, Xavier Estivill, Chris Tyler-Smith, Nigel P. Carter, Hiroyuki Aburatani, Charles Lee, Keith W. Jones, Stephen W. Scherer, and Matthew E. Hurles. "Global Variation in Copy Number in the Human Genome." Nature (2006): Web.

Smyth, G.K. Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions using R and Bioconductor. R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W. Huber (eds), Springer, New York, pages , Web.

Wang, P., Y. Kim, J. Pollack, B. Narasimhan, and R. Tibshirani. A Method for Calling Gains and Losses in Array CGH Data. Biostatistics 6.1 (2004): Web.

Zhang, N. DNA Copy Number Profiling in Normal and Tumor Genomes. Frontiers in Computational and Systems Biology. Vol. 15. London: Springer, Web.

FIGURES

Figure 1: Forms of chromosome changes.
Figure 2: Schematic representation of array CGH.
Figure 3: Boxplots comparing the efficacy of the Hidden Markov Models.
Figure 4: Log ratios for five patients, used as the emissions in the Hidden Markov Models.
Figure 5: Copy number states identified with the heterogeneous Hidden Markov Model for the five patients. The states range from 0 to 5 across the chromosomes, depending on the patient.