IPA Advanced Training Course, October 2013, Academia Sinica. Gene (Kuan Wen Chen), IPA Certified Analyst
Agenda: I. Data Upload and How to Run a Core Analysis; II. Functional Interpretation in IPA (Hands-on Exercises); III. Comparison Analyses; IV. Using IPA to Explore microRNA Impacts on Molecular Mechanisms of Disease (Hands-on Exercises); V. Data Analysis & Interpretation in IPA: Case study for association of DNA copy number alterations and ovarian cancer; VI. Q&A. Proprietary and Confidential
Introduction to Data Upload and Analysis. Why do we run an analysis? Ingenuity's analyses return: the relevant functions associated with the uploaded data; the affected signaling and metabolic pathways associated with the uploaded data; networks of interactions among the uploaded molecules as well as related molecules. What types of analyses does IPA have? Core Analysis, Tox Analysis, and Metabolomics Analysis. All IPA analyses use the same algorithms and return the same information; only the order in which the results are presented differs.
Workflow for Dataset Analysis: data from genomic, exon, miRNA, SNP, and protein arrays, any molecule lists, and other proteomic and metabolomic assays go into IPA, which identifies the functions, diseases, and canonical pathways associated with your data.
Key Terminology. Observation: An experimental condition such as a time point, disease subtype, or compound concentration. Expression Value: A numerical value indicating the level of expression, significance, or other assay result for a specific identifier (gene, RNA, protein, or chemical). Reference Set: The set of molecules used as the universe of molecules when calculating the statistical relevance of biological functions and pathways with respect to a dataset file. The reference set is either the user's dataset or the molecules in Ingenuity's Knowledge Base (genes, endogenous chemicals, or both). Focus Molecule: A molecule that comes from the uploaded list, passes the applied filters, and is available for generating networks.
Setting Up a Dataset (figure labels: Identifiers, Replicates, Average, Other observations)
Best Practices for Dataset Analysis. Calculate averages and p-values for replicate samples outside of IPA. Create an Excel spreadsheet: one column must have identifiers; up to 20 observations; up to 3 expression value types per observation; only 1 header row. Set a cutoff value for each expression value type used. For large datasets, use a p-value and another expression value type. Check the number of Molecules Eligible for Network generation. Cutoffs depend on the confidence in the values, but many users apply fold-change cutoffs of 1.5 and -1.5 and a p-value cutoff of 0.01. Use the Recalculate button to refresh the screen.
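The cutoff step can be tried outside IPA before upload. The following is a minimal sketch in Python, assuming the fold-change cutoffs of 1.5 / -1.5 and the p-value cutoff of 0.01 mentioned on the slide. The gene identifiers and values are made up for illustration, the function name passes_cutoffs is hypothetical, and treating the cutoffs as inclusive is an assumption; none of this is IPA code.

```python
# Hypothetical rows: (identifier, fold change, p-value). Values are invented for illustration.
rows = [
    ("GENE_A", 2.3, 0.001),
    ("GENE_B", -1.8, 0.004),
    ("GENE_C", 1.2, 0.0005),  # significant, but fold change too small
    ("GENE_D", -2.6, 0.2),    # large fold change, but not significant
]


def passes_cutoffs(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.01):
    """Return True if |fold change| >= fc_cutoff and p-value <= p_cutoff.

    Inclusive comparisons are an assumption made for this sketch.
    """
    return abs(fold_change) >= fc_cutoff and p_value <= p_cutoff


# Keep only the identifiers that pass both cutoffs.
eligible = [ident for ident, fc, p in rows if passes_cutoffs(fc, p)]
print(eligible)
```

Combining a p-value with a second expression value type, as the slide recommends for large datasets, keeps the list of eligible molecules small enough to interpret.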
Example for Dataset Analysis. Observation 1: Smokers vs. NonSmokers; Observation 2: Early COPD vs. NonSmokers; Observation 3: COPD vs. NonSmokers.
After Today, You Should Be Able To: define the following IPA terms (Reference Set, Observation, Expression Value, Network Eligible Molecules, Functions/Pathways/Lists Eligible Molecule, Focus Molecule); upload a dataset; run an analysis using IPA best practices; access the Analysis Summary page.
Live Demo
Introduction to IPA Functional Analysis. Organizes biological information for a high-level overview. Provides access to detailed information on functions and the molecules involved. Functional Analysis categories contain high-quality Gene Ontology information and Ingenuity Expert Findings. Includes Diseases and Disorders, and normal processes in abnormal tissues (for example, apoptosis in tumor cells).
Key Terminology. High Level Functions and Categories: Top-level functional annotation categories in Ingenuity's Knowledge Base. There are currently 85 high-level functional categories, in 3 groups: Molecular and Cellular Functions, Physiological System Development and Function, and Diseases and Disorders. Function: In Ingenuity's functional ontology, the specific function and the associated molecules that participate in that function, obtained either from a search result or from an analysis. Effect on Function: IPA can also sort the molecules involved in a particular function by the effect they have on that function.
Interpret Downstream Biological Functions: identify over-represented biological functions and predict how those functions are increased or decreased in the experiment.
Mechanistic Networks. How might the upstream molecule drive the observed expression changes? Hypothesis generation and visualization. Each hypothesis generated indicates the molecules predicted to be in the signaling cascade.
After Today, You Should Be Able To: define the terms Focus Molecule, Functional Category, Function, and Effect on Function; access Functional Analysis for a dataset; customize the bar charts according to your preferences; describe Ingenuity's Functional Categorization; describe the significance values calculated for Functions/Pathways; access Canonical Pathways and Networks for a dataset; view the molecules involved in a Canonical Pathway and Network; view the Canonical Pathway diagram with data overlaid.
Live Demo
What Does the Z-Score Mean? (Figure: a transcription regulator, TR, with downstream genes labeled +1, -1, or 0.) TR: transcription regulator. The labels are the literature-based effect the TR has on its downstream genes. Predicted activation state of the TR: 1: activated (correlated); -1: inhibited (anti-correlated). An absolute z-score of 2 or more is considered significant. An upstream regulator is predicted to be: Activated if the z-score is ≥ 2; Inhibited if the z-score is ≤ -2.
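The slide does not give the formula behind the z-score. The sketch below assumes the commonly described form, in which each downstream gene contributes +1 (consistent with activation) or -1 (consistent with inhibition) and the sum is divided by the square root of the number of genes. That formula and the function names are assumptions for illustration; only the threshold of 2 comes from the slide.

```python
import math


def activation_z_score(signs):
    """Compute sum(signs) / sqrt(N) for a list of +1 / -1 values.

    Each value is one downstream gene: +1 if its observed change agrees with
    the regulator being activated, -1 if it agrees with inhibition.
    This formula is an assumption; the slides do not state it.
    """
    n = len(signs)
    if n == 0:
        raise ValueError("at least one downstream gene is required")
    return sum(signs) / math.sqrt(n)


def predicted_state(z, threshold=2.0):
    """Apply the slide's rule: activated if z >= 2, inhibited if z <= -2."""
    if z >= threshold:
        return "activated"
    if z <= -threshold:
        return "inhibited"
    return "no prediction"


# Nine downstream genes, all consistent with activation.
z = activation_z_score([1] * 9)
print(z, predicted_state(z))
```

With mixed evidence, for example two +1 and two -1 values, the sum is zero and no activation state is predicted.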
Upstream Regulators and Mechanistic Networks. The algorithm chains interacting regulators together to create a Mechanistic Network. (Figure labels: Upstream Regulator, Additional Upstream Regulators, Dataset Molecules, Mechanistic Network.)
Ratio vs. Significance. Ratio is calculated as the number of genes from your dataset that overlap with the Canonical Pathway in question, divided by the total number of genes that are represented in that Canonical Pathway. It is meant to measure the amount of overlap. Significance is calculated in the same way as in Functional Analysis. It is meant to measure the confidence of the overlap. Either value is acceptable; they are different ways to look at which pathways overlap with, or are represented by, your dataset.
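To make the two measures concrete, here is a minimal Python sketch. The ratio follows the definition on the slide. For significance, the slide only says it is calculated as in Functional Analysis; the sketch assumes a right-tailed Fisher's exact test (a hypergeometric tail probability) against a reference set. The function names and the example numbers are hypothetical.

```python
from math import comb


def pathway_ratio(overlap, pathway_size):
    """Number of dataset genes in the pathway divided by the pathway's total genes."""
    return overlap / pathway_size


def right_tailed_fisher_p(overlap, dataset_size, pathway_size, reference_size):
    """Probability of seeing at least `overlap` pathway genes by chance.

    Draws `dataset_size` genes from a reference set of `reference_size` genes,
    of which `pathway_size` belong to the pathway. Using this test here is an
    assumption made for the sketch.
    """
    max_overlap = min(dataset_size, pathway_size)
    total = comb(reference_size, dataset_size)
    tail = sum(
        comb(pathway_size, k) * comb(reference_size - pathway_size, dataset_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 8 of 40 pathway genes appear in a 200-gene dataset,
# measured against a 20,000-gene reference set.
print(pathway_ratio(8, 40))
print(right_tailed_fisher_p(8, 200, 40, 20000))
```

A large pathway can have a small ratio and still be highly significant, which is why the two values are worth reading side by side.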
How Networks Are Generated. 1. Focus molecules are seeds. 2. Focus molecules with the most interactions to other focus molecules are then connected together to form a network. 3. Non-focus molecules from the dataset are then added. 4. Molecules from Ingenuity's Knowledge Base are added. 5. The resulting networks are scored and then sorted based on the score.
Hands-on Exercises I. 1. Upload a dataset into IPA. You may use your own or we can provide you with an example. 2. What is the top function associated with your dataset? 3. How can you find out what main functions a Canonical Pathway (or group of Canonical Pathways) is involved in? 4. What are the functions of the top network in this analysis?
Comparison Analysis allows you to analyze changes in biological states across observations. 1. First run a Core, Tox, or Metabolomics Analysis on each of your datasets that represent time points or dosage treatments. 2. Then use Comparison Analysis to understand which biological processes, clinical pathology endpoints, diseases, and pathways are relevant to each time point or dose.
IPA Compare Data Feature. Compare List 1, List 2, and List 3 to find the molecules that are: unique to each list; common across all lists; in the union across all lists.
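The three results of a list comparison correspond to ordinary set operations. The Python sketch below illustrates them with three invented gene lists; the list contents and variable names are hypothetical and this is not how IPA itself is implemented.

```python
# Hypothetical molecule lists, invented for illustration.
lists = {
    "List 1": {"TP53", "MYC", "EGFR"},
    "List 2": {"MYC", "EGFR", "KRAS"},
    "List 3": {"EGFR", "BRCA1"},
}

# Common across all lists: the intersection of every list.
common = set.intersection(*lists.values())

# Union across all lists: every molecule that appears in at least one list.
union = set.union(*lists.values())

# Unique to each list: molecules that appear in that list and in no other.
unique = {
    name: members - set.union(*(other for n, other in lists.items() if n != name))
    for name, members in lists.items()
}

print(sorted(common))
print(sorted(union))
print({name: sorted(members) for name, members in unique.items()})
```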
Live Demo
Filter Datasets for Biomarkers or miRNA Targets. Filtering stages shown: miRNA Data, miRNA Target Filter, Molecule Type, Pathways (Cancer/Growth), mRNA? Counts shown, in order: 88 data points, 13,690 targets, 1,090 targets, 333 targets, 39 targets, 32 targets. Use Pathway tools to build hypotheses for microRNA to mRNA target association.
Using Biological Context in miRNA Target ID. Goal: Utilize newly discovered microRNAs to better understand the biology around potential mRNA targets and disease. Challenge: A new and rapidly evolving field, with different measurement techniques and prediction algorithms leading to variability in data. Need: Identify mRNA targets of microRNAs using biological and experimental information, correlate microRNA and mRNA target expression, specify easy-to-use confidence levels for interaction predictions, and annotate mRNA targets with biological context, pathways, species, etc., all within a single workflow. Outcome: Reduce the time needed to identify relevant mRNA targets from months to minutes.
Tagline: (1 TargetScan search) x (each microRNA in your data set) = A LOT of targets
Live Demo
Hands-on Exercises II. Overall Exercises: 1. Use the COPD analysis results from Exercises I. 2. What is the observed effect on the Xenobiotic Metabolism Signaling Canonical Pathway in the Early COPD group? 3. In the COPD group, focus on the function Cellular Movement. Select these genes and add them to a new My Pathway in your IPA account. How many of the proteins in this pathway are enzymes? 4. In the Upstream Regulators chapter of the Early COPD vs. NonSmokers observation, filter the Molecule Type to only Transcription Factors. Which molecule is predicted to be Inhibited with the lowest z-score?
Hands-on Exercises II, cont. Overall Exercises: 5. In studies of nicotine metabolism in smokers, it has been estimated that 70% of a nicotine dose is metabolized to cotinine. Which group shows the strongest effect on the Nicotine Degradation pathway? 6. In the Upstream Regulators chapter of the observations, which molecule is predicted to be activated in both the early and late COPD groups?
Prioritization of Candidate Ovarian Cancer Genes with IPA. Survey copy number alterations (CNAs) in ovarian tumors using the Affymetrix 500K SNP Chip. Profile the expression patterns of these tumors and of whole-ovary normal samples with Affymetrix U133A and B chips. Focus on drivers, not passengers: eliminate gene expression changes that are not CNA-specific.
Goals of Study: initial study, extended analysis in IPA. DNA copy number variations are frequently observed in ovarian cancer. What are the most relevant alterations (recurrent CNAs)? What are the causal genes in those regions? Which copy number variations are functionally relevant to oncogenesis? Identification of causal genes provides candidate therapeutic targets and biomarkers. Help find genes with tumor-driving roles in ovarian cancer. What functions are these genes associated with? Can we infer that modulation of those functions may be a major driving factor in the selection of CNAs? Prioritize those candidates based on biomarker, drug target, cellular, and disease knowledge.
Prioritize Candidate Genes from CNA Study. Which are implicated in ovarian cancer? (Search Ingenuity Knowledge Base.) What molecular interactions exist, and do those interactions represent a collective biological function? Potential driver for carcinogenesis? (IPA Core Analysis.) Does an assay exist to measure key, carcinogenesis-relevant gene products in a clinical setting? Identify exploratory clinical biomarkers. (Overlay Biomarkers.) Narrow in on key genes and relationships that are relevant in multiple contexts and datasets: MECOM/SMARCA2/CCNE1.
Additional Evidence Linking CNA Genes to Ovarian Cancer. Which of them are implicated in ovarian cancer through other studies? What is the evidence? Search for all ovarian cancer genes in the Ingenuity Knowledge Base. Visualize those ovarian cancer genes in a pathway. Overlay the CNA gene expression dataset. Focus on ovarian cancer genes whose gene expression is altered based on the CNA study.
Validation, Therapeutic Relevance in Ovarian Cancer. 13 of the genes identified in this study have been implicated in previous ovarian cancer gene expression studies with similar up/down regulation patterns. Several are targets of ovarian cancer drugs: SPINT2, EPCAM, PTGER1, PRKC1, CKD4, HDAC10.
Analysis Summary: Cell Death, Cell Cycle networks and Functional groups; Carcinogenesis and Apoptosis pathways.
Top-scoring network for CNA-specific gene expression changes. Points to a key biological process that may be a driver of carcinogenesis in ovarian tissue. Provides a pool of candidate genes for further prioritization and validation (mRNA levels, protein levels, functional validation).
Summary Report for Apoptosis Network. Highlights therapeutic relevance and expands biological understanding of the network.
Initial Conclusions from Pathway Analysis. A collection of expression changes that are specific to ovarian cancer tumors and regions of high copy number alteration. Strong association with apoptosis. The biological process and genes may be potential drivers of carcinogenesis. Contains known anti-neoplastic drug targets: TNFRSF10B, IGF1R, VEGF, MAPK11/12.
What methods and assays exist? Identify exploratory clinical biomarkers.
Understand the method and application of an exploratory biomarker. An ELISA exists to measure IGF1R levels and activity in clinical samples. CASP8 is being used as a secondary outcome marker (impact on apoptosis and cell cycle arrest) for CDDO anti-tumor therapy.
Integration of multiple lines of evidence highlights MECOM. Upregulated in CNA-specific gene expression analysis of ovarian tumors. Upregulated in other ovarian cancer studies (Findings in Ingenuity Knowledge Base). Target of miRNA upregulated in ovarian cancer (Dahiya et al., Johns Hopkins). In a complex with HDAC, which is a target of CML therapeutic intervention. Role in CML: represses the TGFβ-mediated growth inhibitory signal. Binds SMARCA2, which is downregulated in CNA-specific gene expression analysis, is a target of miRNA downregulated in ovarian cancer, and binds ovarian cancer markers.
MECOM/SMARCA2 hypothesis. The CCNE1, SMARCA2, MECOM (EVI1), MBD3 relationship plays an important role in cell proliferation and growth arrest checkpoints. Deregulation of transcript levels and miRNA in ovarian tumors suggests an important area for validation studies. Validate mRNA and protein levels and their role as drivers of carcinogenic processes.
Conclusions. Evaluation of Copy Number Alteration (CNA)-specific gene expression changes provides valuable insight into genes that are potential drivers of carcinogenesis. Analysis of CNA-specific gene expression changes in IPA identified key processes and pathways that may be driven by these genes: apoptosis networks; Molecular Mechanisms of Cancer and Death Receptor Signaling pathways. It provides an initial pool of candidate genes that may be useful as markers of carcinogenesis in ovarian tissue. Examination of candidate genes in the context of multiple lines of evidence narrows in on a subset of genes for validation studies (validation of protein levels, functional validation): highlighting exploratory clinical biomarkers and therapeutic targets; overlaying additional mRNA and miRNA datasets.
Thank you. Please feel free to contact us. Office: +886-2-2795-1777#1169. Fax: +886-2-2793-8009. My E-mail: Genechen@gga.asia. MSC Support: msc-support@gga.asia