
University of Pittsburgh Cancer Institute UPMC CancerCenter Uma Chandran, MSIS, PhD chandran@pitt.edu 412-648-9326 2/21/13

University of Pittsburgh Cancer Institute
- Founded in 1985
- Director: Nancy Davidson, MD, a leader in breast cancer research
- 350 clinical and research faculty
- $174 million in research grants; 12th nationally in NCI funding
- 11 CCSG programs
- 17 shared resources

Clinical: UPCI and UPMC CancerCenter
- Only NCI-designated cancer center in Western Pennsylvania
- 30 hub and satellite treatment centers
- 2 in Ireland
- 1000+ clinical trials

Administration - UPCI
- Director
- Clinical Executive advisory committee
- Research Executive advisory committee
- Research leaders
- Shared facility directors
- External advisory board

Programs
- Biobehavioral Medicine in Oncology Program
- Brain Tumor Program
- Cancer Epidemiology and Control Program
- Cancer Immunology Program
- Cancer Virology Program
- Head and Neck Cancer Program
- Lung and Thoracic Malignancies Program
- Melanoma Program
- Molecular and Cellular Cancer Biology Program
- Molecular Therapeutics and Drug Discovery Program
- Prostate Cancer Program

Shared Resources
- Animal
- Biobehavioral medicine
- Biostats
- Cancer Biomarker Facility
- Cancer Informatics
- Cell and Tissue Imaging
- Cell culture and cytogenetics
- Chemical Biology
- Clinical Pharmacology
- Clinical Research
- Cytometry
- Hematopoietic Stem Cell Laboratory
- Immunologic monitoring and cellular products Laboratory
- In vivo imaging
- Investigational drug services
- Tissue and Research Pathology Services
- Vector Core

Cancer Informatics Service (CIS)
- Cancer Center Support Grant (CCSG)
- Michael J Becich, MD, PhD, CIS Core Co-Director
- Uma Chandran, PhD, Core Co-Director, Bioinformatics Lead

Cancer Informatics Services
- Michael Becich, MD, PhD
- Chairman, Department of Biomedical Informatics
- National leader in Pathology Informatics
- Developed model for tissue repositories, annotation, honest broker services, research registry services, CPCTR, PCABC, caBIG in ICR, vocabulary, caTissue, caArray (largest depositors of data to caArray), caTIES
- 20+ publications in 2 years

CIS Services: Introduction to Services (3/31/2011)
Six Integrated Service Areas:
- Clinical Trial Management Application (CTMA)
- Registry Research Info Services (RRIS)
- Honest Broker Services (HBS)
- Bioinformatics Service
- Organ-specific data warehousing / personalized medicine
- Storage, Archival and Network Services (UPCI IT)

Personalized Medicine Landscape
- Biobanking
- Processing
- Information models
- Ontologies
- Causal discovery
- Literature
- Prediction
- Clinical decision support
- Data standards
- Software development
- Analytics platforms
- Data sharing

Earlier Pilots: Lessons Learned
- Clinical and pathologic data reside within siloed databases
- Lack of structured data; key data often resides within unstructured path reports or clinical notes
- Lack of harmonization between patient data within the various data sources
- Multiple bio-bank databases (and spreadsheets) for specimen inventory
- Multi-disciplinary team is required to understand domain
- Limited tools available for inspection and analysis of data

What is the purpose of the Data Warehouse?
1) The data warehouse/trc will integrate disparate clinical and omics data to allow integrated analysis, e.g. does mutation of PI3K associate with ER+ breast cancer and outcome?
2) The data warehouse/trc will integrate diagnostic tests, payer information, and outcome to allow cost-effective research of old and new tests, e.g. has integration of Oncotype DX into Pathways resulted in improved patient outcome, less chemotherapy use, and less cost?
3) The EMR and data warehouse/trc will be used to capture prospective clinical information and develop new knowledge about factors affecting patient outcome, e.g. for patients we have treated with neoadjuvant chemotherapy, do biomarkers predict outcome?
4) Learning from HOW we answer 1-3, we can begin to make decisions about the most effective means to collect, integrate, store and analyze this data.
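As a rough illustration of the integrated analysis in item 1, the sketch below joins a clinical table and a mutation table on a shared patient ID and tests the association with a two-sided Fisher exact test. The patient IDs, field values, and function names are hypothetical and made up for illustration; the slides do not specify how the warehouse stores or analyzes these data.

```python
from math import comb

# Hypothetical, made-up records: one table from the clinical silo and one
# from the omics silo, both keyed by a de-identified patient ID.
clinical = {"P1": "ER+", "P2": "ER+", "P3": "ER-", "P4": "ER+",
            "P5": "ER-", "P6": "ER-", "P7": "ER+", "P8": "ER-"}
pi3k_mutated = {"P1": True, "P2": True, "P3": False, "P4": True,
                "P5": False, "P6": True, "P7": False, "P8": False}


def contingency(clinical, mutated):
    """Join on patient ID and count ER status against mutation status."""
    a = b = c = d = 0
    for pid in clinical.keys() & mutated.keys():
        er_pos = clinical[pid] == "ER+"
        if er_pos and mutated[pid]:
            a += 1          # ER+, mutated
        elif er_pos:
            b += 1          # ER+, wild type
        elif mutated[pid]:
            c += 1          # ER-, mutated
        else:
            d += 1          # ER-, wild type
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mutated patients in the ER+ row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


table = contingency(clinical, pi3k_mutated)
p_value = fisher_exact_two_sided(*table)
print(table, round(p_value, 4))
```

The join step is the part the warehouse is meant to make routine: once both sources share a patient key, the statistical test itself is short.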

Why Breast Cancer for the Pilot Project?
- Pitt/UPMC is the largest contributor of tissue to TCGA: approximately 10% and 143 breast cancer tissues with adjacent normal
- Leading cancer for personalized care, with the first targeted therapy (tamoxifen), predictive biomarkers (ER, HER2) and multi-gene test (Oncotype DX)
- Large bio-repository (16,000 treated patients; 4,000 frozen tumors) with associated clinical information
- Leaders in the area of clinical care and research who are committed to personalized cancer medicine
- Integrated clinical care (surg onc, med onc, path, rad, rad onc) in a single hospital (Magee-Womens Hospital)

Skin SPORE IT Solution: Challenges and Aims
- Outside consults (patient data not available in UPMC systems)
- Synoptic data not available for all time frames; synoptic data was not generated for all cases due to limited data received by pathologist
- Data inconsistencies in source systems (missing key identifiers, misspellings, invalid dates)
- Leverage existing lab registry study data and toolsets such as CTMA and i2b2
- Establish integration points between disparate clinical and research applications
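The data inconsistencies listed above (missing key identifiers, misspellings, invalid dates) can be screened for before records are integrated. The following is a minimal sketch under stated assumptions: the field names (`patient_id`, `specimen_id`, `collection_date`), the date format, and the small vocabulary are all hypothetical and not taken from the Skin SPORE systems.

```python
import difflib
from datetime import datetime

# Hypothetical key fields and controlled vocabulary, for illustration only.
REQUIRED_KEYS = ("patient_id", "specimen_id")
VOCAB = ("melanoma", "basal cell carcinoma", "squamous cell carcinoma")


def normalize_term(term):
    """Map a possibly misspelled term to the closest vocabulary entry, or None."""
    matches = difflib.get_close_matches(term.strip().lower(), VOCAB, n=1, cutoff=0.8)
    return matches[0] if matches else None


def validate_record(record):
    """Return a list of problems found in one source-system record."""
    problems = []
    for key in REQUIRED_KEYS:
        # Treat absent, None, and blank values as a missing identifier.
        if not str(record.get(key) or "").strip():
            problems.append(f"missing {key}")
    raw_date = record.get("collection_date", "")
    try:
        # Assumed ISO-style dates; strptime rejects impossible dates.
        datetime.strptime(raw_date, "%Y-%m-%d")
    except (ValueError, TypeError):
        problems.append(f"invalid collection_date: {raw_date!r}")
    return problems


record = {"patient_id": "P1", "specimen_id": "",
          "collection_date": "2012-02-30", "diagnosis": "Melenoma"}
print(validate_record(record))
print(normalize_term(record["diagnosis"]))
```

Returning a list of problems rather than raising on the first one lets a curator see every issue in a record at once, which suits a review queue better than a hard failure.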

Skin SPORE IT Solution: Advantages
- Eliminates rekeying of data and limits duplicate data entry
- Increases data quality through single-source capture; updates to the source data are captured automatically
- Provides an integrated view of data from disparate systems

Bioinformatics
- Microarrays: all platforms including Affymetrix, Illumina and Agilent
- SNPs, microarray, microRNA, integrative analysis
- TMA
- Proteomic
- Integrative genomic and proteomic analyses
- Teaching: workshops on using open source tools to aid investigators in interpreting data and analysis

Bioinformatics utilization
[Figure: hours of bioinformatics use per year, 2004 to 2011 (y-axis 0 to 2500 hours), for the Prostate Cancer Program versus all other programs.]

Bioinformatics projects
- mRNA expression in breast, ovarian
- mRNA and miRNA in melanoma
- Analyze publicly available GEO datasets
- Cell line data from DNA repair genes
- miRNA in breast cancer cell lines
- CN and expression analysis in breast cancer
- CN analysis in metastatic lung tumors
- Integrative analysis across TCGA platforms
- Some NGS with Renal Cell tumors
- Requirements for Personalized medicine

Bioinformatics Challenges
- Tremendous increase in requests for services
- Shifting towards analysis of consortia projects such as TCGA, 1000 Genomes
- Next Gen Sequencing: limited resources, lack of training
- 2 week workshops on NGS
- Personalized Medicine Task Force
- Institute for Personalized Medicine
- Evaluating campus-wide infrastructure for storage, archiving, analysis: Pittsburgh Supercomputing Center, Pitt Research Clusters
- Campus-wide licenses for CLC Biosciences installed on Pitt clusters
- Working with vendors on pipelines: KNOME, Genome Oncology
- Hired an additional FTE

CIS funding
- 20% CCSG; the rest grant supported and institutional
- Challenging to enforce fee-for-service model, especially for bioinformatics
- However, investigators do not budget for the true cost of bioinformatics
- Difficult to add staff
- Bioinformatics researchers do not want to be involved in core service because of competing pressure for career advancement