Structured Association: Advanced Topics in Computational Genomics
- Mabel Logan
- 5 years ago
Transcription
1 Structured Association: Advanced Topics in Computational Genomics
2 Structured Association. Lasso: ACGTTTTACTGTACAATT. Gflasso (Kim & Xing, 2009): ACGTTTTACTGTACAATT. Greater power; fewer false positives; phenome associations.
3 Structured Association. Lasso: ACGTTTTACTGTACAATT. Network-constrained regularization (Li & Li, 2008): ACGTTTTACTGTACAATT.
4 Regression with Regularization Fused lasso (Tibshirani et al., 2004)
5 Regression with Regularization (Fused Lasso). Panels: standard regression; lasso; fusion penalty only; fused lasso. Black line: true values. Red line: estimated values.
6 Lasso for Reducing False Positives (Tibshirani, 1996). Trait (2.1) = Genotype (T G A A C C A T G A A G T A) x Association Strength. Lasso penalty for sparsity: $\hat{\beta} = \arg\min_{\beta} \, (y - X\beta)^T (y - X\beta) + \lambda \sum_j |\beta_j|$. Many zero associations (sparse results), but what if there are multiple related traits?
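To make the lasso objective on this slide concrete, here is a minimal sketch of a coordinate-descent solver in NumPy. The solver choice, the function names, the simulated genotype matrix, and the value of `lam` are all illustrative assumptions; the slides do not say how the lasso was fit.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    return np.sign(value) * np.maximum(np.abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (y - X b)^T (y - X b) + lam * sum_j |b_j| by cyclic coordinate descent."""
    n_samples, n_snps = X.shape
    beta = np.zeros(n_snps)
    col_sq = (X ** 2).sum(axis=0)  # X_j^T X_j for every column
    for _ in range(n_iter):
        for j in range(n_snps):
            # Residual with SNP j's current contribution added back in.
            partial_residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_residual
            # Threshold is lam / 2 because the squared-error term has no 1/2 factor.
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return beta

# Hypothetical data: 100 individuals, 20 SNPs coded as minor-allele counts,
# with a single causal SNP (index 3).
rng = np.random.default_rng(0)
X = rng.choice([0.0, 1.0, 2.0], size=(100, 20))
X -= X.mean(axis=0)
true_beta = np.zeros(20)
true_beta[3] = 2.1
y = X @ true_beta + 0.5 * rng.standard_normal(100)

beta_hat = lasso_coordinate_descent(X, y, lam=40.0)
print(np.flatnonzero(beta_hat))  # indices of SNPs with non-zero estimated association
```

Raising `lam` sets more coefficients exactly to zero, which is the sparsity the slide refers to; each trait would be fit separately in this form.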
7 Multivariate Regression for Multiple-Trait Association Analysis. Traits (3.4, 1.5, 2.1, 0.9, 1.8), e.g. allergy, lung physiology = Genotype (T G A A C C A T G A A G T A) x Association Strength. Association strength between SNP j and trait k: $\beta_{jk}$. Objective: $\arg\min_{\beta} \sum_k (y_k - X\beta_k)^T (y_k - X\beta_k) + \lambda \sum_k \sum_j |\beta_{jk}|$ + a graph-guided fusion penalty, which we introduce.
8 Multiple-trait Association: Graph-Constrained Fused Lasso. Step 1: thresholded correlation graph of phenotypes. Step 2: graph-constrained fused lasso (ACGTTTTACTGTACAATT). Labeled terms of the objective: lasso penalty; graph-constrained fusion penalty.
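Step 1 can be sketched as follows, assuming the trait graph is built by keeping an edge wherever the absolute Pearson correlation between two phenotypes exceeds a cutoff. The function name, the cutoff `rho`, and the toy phenotype matrix are hypothetical.

```python
import numpy as np

def thresholded_correlation_graph(Y, rho):
    """Return edges (m, l, r_ml) between traits whose |correlation| exceeds rho.

    Y is an (individuals x traits) phenotype matrix.
    """
    corr = np.corrcoef(Y, rowvar=False)
    n_traits = corr.shape[0]
    return [
        (m, l, float(corr[m, l]))
        for m in range(n_traits)
        for l in range(m + 1, n_traits)
        if abs(corr[m, l]) > rho
    ]

# Toy example: traits 0 and 1 are nearly collinear, trait 2 is unrelated.
Y = np.array([
    [1.0, 2.0, 1.0],
    [2.0, 4.0, -1.0],
    [3.0, 6.0, -1.0],
    [4.0, 8.5, 1.0],
])
edges = thresholded_correlation_graph(Y, rho=0.5)
print(edges)  # only the (0, 1) pair survives the threshold
```

The resulting edge list is what the fusion penalty in Step 2 is summed over.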
9 Fusion Penalty. SNP j (ACGTTTTACTGTACAATT). Association strength between SNP j and trait k: $\beta_{jk}$. Association strength between SNP j and trait m: $\beta_{jm}$. Fusion penalty: $|\beta_{jk} - \beta_{jm}|$. For two correlated traits (connected in the network), the association strengths may have similar values.
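A short sketch of how the fusion penalty on this slide could be evaluated for a whole coefficient matrix: for every edge of the trait network, sum the absolute differences of the two traits' association strengths over all SNPs. The function name and the toy values are illustrative assumptions.

```python
import numpy as np

def fusion_penalty(B, edges):
    """Sum over edges (m, l) and SNPs j of |B[j, m] - B[j, l]|.

    B is a (SNPs x traits) matrix of association strengths.
    """
    return float(sum(np.abs(B[:, m] - B[:, l]).sum() for m, l in edges))

# Two SNPs, three traits. Traits 0 and 1 share SNP 0's effect exactly.
B = np.array([
    [1.0, 1.0, 0.0],
    [0.0, 2.0, 0.0],
])
print(fusion_penalty(B, [(0, 1)]))          # 2.0: only SNP 1 differs between traits 0 and 1
print(fusion_penalty(B, [(0, 1), (1, 2)]))  # 5.0: the edge (1, 2) adds |1 - 0| + |2 - 0|
```

The penalty is zero only when connected traits have identical association strengths, which is why minimizing it pulls those strengths toward each other.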
10 Graph-Constrained Fused Lasso. Overall effect (ACGTTTTACTGTACAATT): the fusion effect propagates to the entire network; association between SNPs and subnetworks of traits.
11 Multiple-trait Association: Graph-Weighted Fused Lasso. Overall effect (ACGTTTTACTGTACAATT): subnetwork structure is embedded as densely connected nodes with large edge weights; edges with small weights are effectively ignored.
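One way to picture the graph-weighted variant is to scale each edge's fusion term by a weight derived from the trait correlation. The sketch below assumes the weight is the absolute correlation |r|; the transcript does not state the weighting function, so treat that choice, the function name, and the numbers as hypothetical.

```python
import numpy as np

def weighted_fusion_penalty(B, weighted_edges):
    """Sum over edges (m, l, r) of |r| * sum_j |B[j, m] - B[j, l]|.

    The |r| weighting is an assumption made for illustration.
    """
    total = 0.0
    for m, l, r in weighted_edges:
        total += abs(r) * np.abs(B[:, m] - B[:, l]).sum()
    return float(total)

B = np.array([
    [1.0, 1.0, 0.0],
    [0.0, 2.0, 0.0],
])
# A strongly correlated pair (0, 1) and a weakly correlated pair (1, 2).
weighted_edges = [(0, 1, 0.9), (1, 2, 0.1)]
print(weighted_fusion_penalty(B, weighted_edges))
```

With these weights the weak edge contributes little even though its coefficient differences are larger, which matches the slide's remark that small-weight edges are effectively ignored.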
12 Estimating Parameters. Quadratic programming formulation for the graph-constrained fused lasso and the graph-weighted fused lasso. Many publicly available software packages for solving convex optimization problems can be used.
13 Improving Scalability. Original problem; equivalently, using a variational formulation. Iterative optimization: update $\beta_k$; update the $d_{jk}$'s and $d_{jml}$'s.
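The formulas on this slide did not survive in the transcript. As a hedged illustration of why a variational formulation leads to alternating updates, the standard identity below replaces a squared sum of absolute values with a minimum over auxiliary weights $d_i$. Whether the slide used exactly this form is an assumption; the symbols $a_i$ and $d_i$ here are generic stand-ins for the penalized quantities and for the slide's $d_{jk}$, $d_{jml}$.

```latex
\Big( \sum_i |a_i| \Big)^2
  \;=\; \min_{d_i \ge 0,\; \sum_i d_i = 1} \; \sum_i \frac{a_i^2}{d_i},
\qquad \text{attained at } d_i = \frac{|a_i|}{\sum_{i'} |a_{i'}|}.
```

With the $d_i$ held fixed, the right-hand side is quadratic in the $a_i$, so the coefficient update reduces to a ridge-like least-squares problem; with the coefficients held fixed, the $d_i$ have the closed form shown. Alternating the two steps gives an iterative scheme of the kind the slide lists.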
14 Simulation Results. 50 SNPs taken from HapMap chromosome 7, CEU population; 10 traits. Panels (SNPs by phenotypes): trait correlation matrix; thresholded trait correlation network; true regression coefficients; single-SNP single-trait test (significant at α = 0.01); lasso; graph-guided fused lasso. Color scale: high association to no association.
15 Asthma Trait Network. Phenotype correlation network with subnetworks for asthma symptoms, lung physiology, and quality of life.
16 Results from Single-SNP/Trait Test. Phenotypes; trait network. Lung physiology-related traits I: Baseline FEV1 predicted value: MPVLung; Pre FEF predicted value; Average nitric oxide value: online; Body Mass Index; Postbronchodilation FEV1, liters: Spirometry; Baseline FEV1 % predicted: Spirometry; Baseline predrug FEV1, % predicted; Baseline predrug FEV1, % predicted. Q551R SNP: codes for amino-acid changes in the intracellular signaling portion of the receptor. Exon 11 SNPs. Color scale: high association to no association. Panels: single-marker single-trait test; permutation test α = 0.05; permutation test α = 0.01.
17 Comparison of Gflasso with Others. Phenotypes; trait network. Lung physiology-related traits I: Baseline FEV1 predicted value: MPVLung; Pre FEF predicted value; Average nitric oxide value: online; Body Mass Index; Postbronchodilation FEV1, liters: Spirometry; Baseline FEV1 % predicted: Spirometry; Baseline predrug FEV1, % predicted; Baseline predrug FEV1, % predicted. Q551R SNP: codes for amino-acid changes in the intracellular signaling portion of the receptor. Exon 11 SNPs? Color scale: high association to no association. Panels: single-marker single-trait test; lasso; graph-guided fused lasso.
18 Simulation Results
19 Linkage Disequilibrium Structure in IL-4R gene. SNP rs; SNP rs; SNP Q551R. $r^2 = 0.64$; $r^2 = 0.07$.
20 Bias and Variance Tradeoff. The penalty function introduces bias into the estimation process but can reduce the variance. The amount of bias is controlled by the choice of regularization parameter.
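The tradeoff can be checked exactly in a setting with a closed-form estimator. The sketch below swaps in ridge regression (a squared penalty) rather than the lasso-type penalties discussed in these slides, purely because its bias and variance have closed forms for a fixed design; the design matrix, coefficients, and grid of `lam` values are hypothetical.

```python
import numpy as np

def ridge_bias_variance(X, beta, sigma, lam):
    """Exact squared bias and total variance of the ridge estimator for fixed X.

    Estimator: (X^T X + lam I)^{-1} X^T y, with y = X beta + noise of sd sigma.
    """
    n_features = X.shape[1]
    gram = X.T @ X
    inverse = np.linalg.inv(gram + lam * np.eye(n_features))
    bias = inverse @ gram @ beta - beta        # E[estimate] - beta
    covariance = sigma ** 2 * inverse @ gram @ inverse
    return float(bias @ bias), float(np.trace(covariance))

rng = np.random.default_rng(1)
X_demo = rng.standard_normal((50, 5))
beta_demo = np.array([2.0, -1.0, 0.5, 0.0, 1.5])

for lam in (0.0, 1.0, 10.0, 100.0):
    bias_sq, variance = ridge_bias_variance(X_demo, beta_demo, sigma=1.0, lam=lam)
    print(f"lam={lam:6.1f}  squared bias={bias_sq:.4f}  total variance={variance:.4f}")
```

As `lam` grows, the squared bias rises while the total variance falls, so the regularization parameter selects a point along that tradeoff.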
21 Network-Constrained Regularization for Leveraging Pathway Information (Li and Li, 2008). Pathway databases as prior biological knowledge: KEGG, Reactome, BioCarta, BioCyc. Leverage the pathway information to detect genes in pathways relevant to the given outcome.
22 Graph Laplacian. Graph Laplacian: $L = D - W$. Weighted adjacency matrix $W$: $w_{ij} = w_{ji}$, with $w_{ij} = 0$ if there is no edge between nodes i and j. Degree matrix $D$: diagonal matrix with diagonal entries $d_i = \sum_j w_{ij}$. Normalized graph Laplacian: symmetric and positive semidefinite.
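A minimal NumPy sketch of these definitions follows. The normalized form used here, $D^{-1/2} L D^{-1/2}$, is the standard one and is assumed to be what the slide showed; the handling of isolated nodes (degree zero) and the three-node example graph are illustrative choices.

```python
import numpy as np

def graph_laplacians(W):
    """Return (L, L_norm) for a symmetric weighted adjacency matrix W.

    L = D - W and L_norm = D^{-1/2} L D^{-1/2}; rows and columns of isolated
    nodes are left at zero in L_norm.
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    L = np.diag(degrees) - W
    inv_sqrt = np.zeros_like(degrees)
    connected = degrees > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degrees[connected])
    L_norm = inv_sqrt[:, None] * L * inv_sqrt[None, :]
    return L, L_norm

# Path graph on three nodes with edge weights 1 and 2.
W = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 2.0],
    [0.0, 2.0, 0.0],
])
L, L_norm = graph_laplacians(W)
beta = np.array([1.0, 2.0, 4.0])
# beta^T L beta equals the weighted sum of squared differences across edges.
print(beta @ L @ beta)
print(np.linalg.eigvalsh(L_norm))  # all eigenvalues are non-negative, one is zero
```

The quadratic form `beta @ L @ beta` is small when connected nodes carry similar coefficients, which is what makes the Laplacian useful as a smoothness penalty.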
23 Network-Constrained Regularized Regression. Network-constrained regularization criterion and an equivalent form. If $L = I$, it becomes the elastic net.
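The criterion itself is missing from the transcript. A plausible reconstruction, assuming the usual form of a lasso penalty plus a Laplacian quadratic penalty with tuning parameters $\lambda_1$ and $\lambda_2$ (names assumed here), is:

```latex
\hat{\beta} \;=\; \arg\min_{\beta} \;
  (y - X\beta)^T (y - X\beta)
  \;+\; \lambda_1 \sum_{i} |\beta_i|
  \;+\; \lambda_2 \, \beta^T L \beta ,
\qquad
\beta^T L \beta \;=\; \sum_{i < j} w_{ij}
  \left( \frac{\beta_i}{\sqrt{d_i}} - \frac{\beta_j}{\sqrt{d_j}} \right)^{2}
```

The second expression holds when $L$ is the normalized graph Laplacian of a graph without self-loops and every node has positive degree. Setting $L = I$ turns the last term into $\lambda_2 \sum_i \beta_i^2$, which is the elastic net penalty, consistent with the slide's remark.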
24 Optimization. Cast it as a lasso optimization problem.
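The definitions that followed on this slide are not in the transcript. One common way to cast a quadratic-plus-lasso criterion as a plain lasso is data augmentation: factor $L = S S^T$ and append $\sqrt{\lambda_2}\, S^T$ as extra rows of the design matrix with zero responses. The sketch below assumes that construction; the original work may additionally rescale the augmented data, which is omitted here, and all names are hypothetical.

```python
import numpy as np

def augment_for_lasso(X, y, L, lam2):
    """Build (X_aug, y_aug) so that
    ||y_aug - X_aug b||^2 == ||y - X b||^2 + lam2 * b^T L b for every b.
    """
    eigvals, eigvecs = np.linalg.eigh(L)
    eigvals = np.clip(eigvals, 0.0, None)   # guard against tiny negative round-off
    S = eigvecs * np.sqrt(eigvals)          # columns scaled so that S @ S.T == L
    X_aug = np.vstack([X, np.sqrt(lam2) * S.T])
    y_aug = np.concatenate([y, np.zeros(L.shape[0])])
    return X_aug, y_aug

rng = np.random.default_rng(2)
X = rng.standard_normal((10, 3))
y = rng.standard_normal(10)
L = np.array([
    [1.0, -1.0, 0.0],
    [-1.0, 3.0, -2.0],
    [0.0, -2.0, 2.0],
])
X_aug, y_aug = augment_for_lasso(X, y, L, lam2=0.7)

b = np.array([0.5, -1.0, 2.0])
lhs = np.sum((y_aug - X_aug @ b) ** 2)
rhs = np.sum((y - X @ b) ** 2) + 0.7 * (b @ L @ b)
print(lhs, rhs)  # the two objectives agree up to floating-point error
```

Any lasso solver applied to `(X_aug, y_aug)` with penalty $\lambda_1$ then minimizes the full network-constrained criterion in this unscaled form.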
25 Simulation Studies. Model: 200 transcription factors, each regulating 10 genes; four transcription factors and their target genes are relevant to the given response.
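The simulation design can be sketched as a data generator. Only the counts (200 transcription factors, 10 targets each, 4 relevant factors) come from the slide; the TF-target correlation of 0.7, the equal effect sizes, the noise level, and every name below are assumptions made for illustration.

```python
import numpy as np

def simulate_tf_network(n_samples, n_tf=200, n_targets=10, n_relevant_tf=4,
                        tf_target_corr=0.7, effect=1.0, noise_sd=1.0, seed=0):
    """Simulate expression for TFs and their targets plus a response.

    Columns are laid out in blocks: one TF followed by its targets.
    Returns (X, y, beta, edges), where edges links each TF to its targets.
    """
    rng = np.random.default_rng(seed)
    block = 1 + n_targets
    n_genes = n_tf * block
    X = np.empty((n_samples, n_genes))
    edges = []
    for t in range(n_tf):
        tf_col = t * block
        X[:, tf_col] = rng.standard_normal(n_samples)
        for g in range(1, block):
            # Target expression is correlated with its TF and has unit variance.
            independent = rng.standard_normal(n_samples)
            X[:, tf_col + g] = (tf_target_corr * X[:, tf_col]
                                + np.sqrt(1.0 - tf_target_corr ** 2) * independent)
            edges.append((tf_col, tf_col + g))
    beta = np.zeros(n_genes)
    beta[: n_relevant_tf * block] = effect  # relevant TFs and all of their targets
    y = X @ beta + noise_sd * rng.standard_normal(n_samples)
    return X, y, beta, edges

X, y, beta, edges = simulate_tf_network(n_samples=30)
print(X.shape, len(edges), np.count_nonzero(beta))
```

With the slide's counts this yields 2200 predictors, 2000 network edges, and 44 truly relevant genes, a setting where the network penalty has grouped structure to exploit.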
26 Results from Simulation Study. Comparison of lasso, elastic net, and network-constrained regularized regression.
27 Analysis of Glioblastoma Dataset. Response: cancer survival/death. Predictors: 1533 genes on 33 KEGG pathways.
28 Gene Graph Components Relevant to Cancer Survival