Goals/Expectations
Computer Science, Biology, and Biomedical Informatics (CoSBBI)

We want to excite you about the world of computer science, biology, and biomedical informatics.

Experience what it is like to be a scientist (train the next generation of scientists)
  Answer a question / ask a question
  Reading papers and discussing them (journal clubs): how scientists learn
  Roundtables: how scientists learn

Career: prepare you for success regardless of your choice
  Cameos
  Professional development

Pipeline: come back, and we will pay you
  We will guide you through the decisions and steps that lead to a career.
  Year-wide involvement
  Letters of recommendation

Opportunity
  Science fairs!
  $ Scholarships
  Lines on your applications

Spread the word.
DO THE SURVEYS AND ALWAYS TELL US HOW TO IMPROVE
KEEP IN CONTACT (LinkedIn?)

Biology of Cancer AND
David Boone

Outline
Goals
  Understand why cancer treatment is difficult.
  Learn how changes in the genome and/or expression can affect proliferation and death.
  Looking at a signaling schematic, be able to identify what a certain mutation might do.
  Define oncogenes and tumor suppressors.
  Define two ways of gathering omics data.
  Be able to analyze heat maps and Kaplan-Meier curves.
  Describe how informatics is impacting personalized medicine.

How is cancer different from a bacterial infection?
  Normal
  Cancer: a term used to describe 100s or 1000s of distinct neoplastic disorders caused by your own cells growing out of control.
Metastasis
  The spreading of cancer from one organ to another.
  The many steps of metastasis occur through clonal selection and the accumulation of different mutations.
  Fidler 2003, Nature Reviews Cancer

How do we make sense of cancer?
  Determine what causes it.
  What changes occur during the initiation and progression of cancer?
  How are cancerous cells different from normal cells?
  Elucidate which patients will respond to what therapies.
  Pattern recognition and logic problems
  Sequencing
  Biomarkers
The Central Dogma of Biology
  DNA -> RNA -> Protein
  Structure is very important.
  Replication; transcription/translation: info to create proteins
  23 chromosomes
  3.2 billion base pairs
  ~20,000 genes (1.5% of the genome)

Central Dogma
  RNA: no T's (replaced with U's)
  Translation start
  Translation stop

Important genes in cancer

Oncogenes: genes that encode proteins that have the potential to cause cancer.
  Like a gas pedal: they make cells divide more frequently or survive when they shouldn't.
  Turned on by activating mutations, amplifications, and overexpression.
  Examples: c-Myc, IGF1R, Ras (growth factors, signaling molecules, transcription factors)
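The transcription and translation steps of the Central Dogma can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool: the codon table is a deliberately tiny subset of the genetic code, the input is assumed to be the coding strand of DNA (so the mRNA is the same sequence with T replaced by U), and the function names and example sequence are made up for this sketch.

```python
# Toy codon table: only the codons used in the example below (an assumption
# made to keep the sketch short; the real genetic code has 64 codons).
CODON_TABLE = {
    "AUG": "M",                                      # translation start (methionine)
    "AAA": "K", "AAG": "K",                          # lysine
    "GCU": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "UAA": "*", "UAG": "*", "UGA": "*",              # translation stop
}


def transcribe(coding_dna: str) -> str:
    """DNA -> RNA: same letters as the coding strand, but every T becomes U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """RNA -> protein: read codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "?")  # "?" = codon not in toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence (not a real gene).
mrna = transcribe("ATGAAAGCTTAA")
protein = translate(mrna)
```

Here `mrna` is "AUGAAAGCUUAA" and `protein` is "MKA": the start codon, one lysine, one alanine, and then the stop codon ends translation.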
Important genes in cancer
  Survival
  IGF1R is an important oncogene in breast cancer (BC).
  Can you pick out any other potential oncogenes?

Tumor suppressors: genes that encode proteins that prevent tumor development.
  Like a brake pedal: they prevent proliferation and initiate cell death if there are problems like DNA damage.
  They keep proliferation and oncogenes in check: stop the cell cycle, induce apoptosis, repair DNA, etc.
  Turned off by inactivating mutations, deletions, and lack of expression.
  Examples: p53, BRCA1/2, PTEN, RB

Myc (oncogene) is one of the most frequent amplifications in BC.
p53 (tumor suppressor) is one of the most frequently mutated or lost genes in BC.
Genetic alterations (not all mutations are equal)
  Point mutations
  Frameshift mutations
  [Figure: DNA -> RNA -> protein; amino acid labels K, A, *]

Genomic alterations
  Point mutations
  Duplications or amplifications (figure label: Myc Myc Myc)
  Deletions (figure label: p53)
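Why mutations are not all equal can be seen by translating a short sequence before and after each kind of change. The sketch below uses the standard genetic code, but the sequence, the helper names, and the chosen positions are hypothetical and only meant as an illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence from its first base until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def substitute(dna: str, position: int, base: str) -> str:
    """Point mutation: replace one base (0-based position)."""
    return dna[:position] + base + dna[position + 1:]


def delete(dna: str, position: int) -> str:
    """Single-base deletion, which shifts the reading frame downstream."""
    return dna[:position] + dna[position + 1:]


# Hypothetical 15-base coding sequence: ATG AAA GCT GGA TAA
normal = "ATGAAAGCTGGATAA"

results = {
    "normal": translate(normal),
    "silent": translate(substitute(normal, 8, "C")),    # GCT -> GCC, still alanine
    "missense": translate(substitute(normal, 3, "G")),  # AAA -> GAA, lysine -> glutamate
    "nonsense": translate(substitute(normal, 3, "T")),  # AAA -> TAA, premature stop
    "frameshift": translate(delete(normal, 3)),         # every downstream codon changes
}
```

In this toy case the silent change leaves the protein as "MKAG", the missense change gives "MEAG", the nonsense change truncates it to "M", and the frameshift gives "MKLD" and reads past the original stop codon.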
Genomic alterations
  Point mutations
  Duplications or amplifications
  Deletions
  Insertions/inversions
  Translocations

Expression alterations outside of the genome
  Overexpression, e.g., Myc (high mitogenic signaling results in high expression of unmutated Myc)
  Epigenetic silencing, e.g., methylation

Outline/Summary: Biology of Cancer
Now we know a lot about individual genes, but how do we study global genetic or expression changes?

Microarrays
  Hybridization based (complementary base pairing)
  Comparative Genomic Hybridization (CGH) or SNP arrays for known DNA variants and copy number changes
  mRNA expression arrays for RNA expression

Next-gen sequencing
  Sequence based
  Can be used for DNA or RNA
  Can detect mutations, copy number changes, and expression differences

Microarrays
  Used for: 1) gene expression 2) copy number changes
  Whole-transcript expression array: covers 28,869 well-annotated genes with 764,885 distinct probes
  Affy 6.0 SNP chip: 906,000 SNPs and ~1,000,000
  Advantages:
    Relatively cheap
    Analyze all known genes
    Analysis is relatively easy
  Disadvantages:
    Cannot detect novel genes, mutations, or copy number changes

Next-gen sequencing
  The Human Genome Project took ~13 years and cost ~3 billion dollars (3 billion bases).
  Now we can sequence a genome in a few weeks for less than a few thousand dollars.
  Advantages:
    No limitation on novel detections
    Simultaneously discover expression or copy number changes AND mutations
  Disadvantages:
    Expensive
    Analysis is complex and difficult (but not for CoSBBI scholars!)
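The idea behind hybridization-based detection (complementary base pairing) can be sketched as a string match against the reverse complement of a probe. This is a strong simplification that assumes perfect-match binding only; real arrays measure signal intensity and tolerate some mismatch. The sequences and function names below are hypothetical.

```python
# Watson-Crick pairing on DNA: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that would base-pair with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(probe: str, target: str) -> bool:
    """Simplifying assumption: a probe binds only a perfectly complementary stretch."""
    return reverse_complement(probe) in target


# A probe is designed in advance against a KNOWN stretch of sequence.
known_region = "TACAGGCT"
probe = reverse_complement(known_region)

reference_sample = "GATTACAGGCTTAA"  # contains the known region
mutated_sample = "GATTACATGCTTAA"    # same region with one novel point mutation

binds_reference = hybridizes(probe, reference_sample)
binds_mutant = hybridizes(probe, mutated_sample)
```

In this sketch the probe binds the reference sample but not the mutated one, and the array alone cannot say what the new base is. Sequencing reads the bases directly, which is why it has no such limitation on novel detections.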
The sequencing boom

The Cancer Genome Atlas (TCGA)
  The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate the understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.
  33 cancers
  275 million dollars
  11,000 patients
  2,700 publications since it started in 2006
  Data types:
    RNA expression (RNA-seq and microarray)
    Exome sequencing (mutation)
    Whole genome (for some)
    Copy number
    Methylation
    miRNA expression

$1000 genome

ENCODE (Encyclopedia of DNA Elements)
  The human genome project of our generation
  Build a comprehensive parts list of functional elements in the human genome.
Pattern recognition/logic problems
HOW DO WE MAKE SENSE OF IT ALL?!?!

Personalized medicine: breast cancer. IT WORKS!!
  Historically, tumors were classified based on size, nodal involvement, invasion, histology, etc.
  More recently, genomics and transcriptomics have provided clues to what drives tumor initiation and progression, and they have the potential to divide patients into separate groups that might respond to different therapies.
  Bayer
  Sørlie T et al. PNAS 2001;98:10869-10874
Gene Expression Patterns Reveal Novel Breast Cancer Subtypes
  Gene expression patterns of 85 experimental samples representing 78 carcinomas, three benign tumors, and four normal tissues, analyzed by hierarchical clustering using the 476 cDNA intrinsic clone set.
  Overall and relapse-free survival analysis of the 49 breast cancer patients, uniformly treated in a prospective study, based on different gene expression classification.
  Sørlie T et al. PNAS 2001;98:10869-10874. 2001 by National Academy of Sciences

Personalized medicine WORKS!!!! Finding patterns!!!
  [Figure: heat map of patients by genes, color scale from low to high; ER- and ER+ groups; subtype labels: basal, HER2+, normal, basal-like, luminal A, luminal B]

Gene expression use in the clinic
  MammaPrint: approved by the FDA
    70-gene signature for patients with node-negative and ER+ tumors
    Compared to conventional classification in ~300 patients: 87 would have been treated differently.
    67 were determined high risk by conventional methods and were given chemo, but were classified low risk by MammaPrint.
    Patients were followed for 10 years, and MammaPrint was more accurate.
  Oncotype
  PAM50
  All are good for deciding chemo vs. endocrine therapy, but we still need classifications for ER- tumors.

Summary
  Cancer is different from other diseases because it is really 100s or 1000s of diseases and is your own cells gone bad.
  Oncogenes cause cancer
    Hyperactive: mutation, amplification
  Tumor suppressors prevent cancer
    Prevent cell cycle progression, induce cell death, repair DNA
    Lost or turned off: mutation, deletion, methylation
  Different types of alterations
    Somatic mutation, amplification, deletion, methylation
  Global analysis
    Microarray, sequencing
  Personalized medicine
    Biomarkers and KM curves
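To read a Kaplan-Meier (KM) curve it helps to see how one is built. The sketch below computes the standard product-limit estimate: at each time where a death occurs, the running survival probability is multiplied by the fraction of at-risk patients who survived that time point. The patient data are made up for illustration and are not from any study cited here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
            (still alive or lost to follow-up when last seen)
    Returns a list of (time, survival_probability), one entry per time
    at which at least one death occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the at-risk group.
        at_risk -= removed
    return curve


# Hypothetical follow-up data (years) for five patients.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

For these five patients the estimate steps down to about 0.8 at year 2, 0.6 at year 3, and 0.3 at year 5. Censored patients never cause a step down, but they shrink the at-risk group, so later deaths produce larger drops. Plotting one such curve per subtype gives the kind of survival comparison shown in the slides.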
[Figure: columns 1 to 5, row A; color scale from low to high]