Heterogeneous Data Mining for Brain Disorder Identification Bokai Cao 04/07/2015
Outline Introduction Tensor Imaging Analysis Brain Network Analysis Davidson et al. Network discovery via constrained tensor analysis of fmri data. In KDD, 2013. Wee et al. Identification of MCI individuals using structural and functional connectivity networks. Neuroimage, 2012. Multiview Feature Analysis Future Work Ye et al. Heterogeneous data fusion for Alzheimer s disease study. In KDD, 2008. 2
Outline Introduction Tensor Imaging Analysis Brain Network Analysis Multiview Feature Analysis Future Work 3
Introduction Brain Disorders HIV, AD, ADHD What is the data about? Neuroimaging Techniques fmri, DTI, PET, EEG Where does the data come from? Data Representations Tensor, graph, vector What does the data look like? 4
Introduction Brain Disorders HIV infection on brain Bipolar disorder Alzheimer's disease (AD) Attention deficit hyperactivity disorder (ADHD) Schizophrenia 5
Introduction Neuroimaging Techniques Functional magnetic resonance imaging (fmri) Diffusion tensor imaging (DTI) Positron emission tomography (PET) Electroencephalogram (EEG) 6
Introduction Data Representations Tensor: raw images Graph: brain networks Vector: multiview features View 1 View 2 View 3 7
Outline Introduction Tensor Imaging Analysis Brain Network Analysis Multiview Feature Analysis Future Work 8
Tensor Imaging Analysis Tensor Data in Neuroimaging x vector matrix X tensor X A voxel is the smallest threedimensional point volume referenced in a neuroimaging of the brain. Typically, 256*256*256. 1storder tensor 2ndorder tensor 3rdorder tensor 9
Tensor Imaging Analysis Brain Network Discovery 1. Node discovery 2. Edge discovery 3. Network verification Davidson et al. Network discovery via constrained tensor analysis of fmri data. In KDD, 2013. 10
Tensor Imaging Analysis Brain Network Discovery Tensor factorization constraints c R b R fmri data c 1 b1 c 2 b2 X a 1 a 2 a R anatomical structures Davidson et al. Network discovery via constrained tensor analysis of fmri data. In KDD, 2013. 11
Tensor Imaging Analysis Brain Network Discovery Contributions: (1) simultaneously discover nodes and edges (2) facilitate better understanding about interaction mechanism between brain regions Drawback: fail to leverage relationships between neuroimages and their associated labels Davidson et al. Network discovery via constrained tensor analysis of fmri data. In KDD, 2013. 12
Tensor Imaging Analysis Supervised Tensor Learning Classification (review) age? features: <age, weight> label: diseased (positive) or healthy (negative)? weight He et al. DuSK: A dual structurepreserving kernel for supervised tensor learning with applications to neuroimages. In SDM, 2014. 13
Tensor Imaging Analysis Supervised Tensor Learning Tensor classification 1. High dimensionality m=256*256*256=16,777,216 x1 x2 age x3... weight xm He et al. DuSK: A dual structurepreserving kernel for supervised tensor learning with applications to neuroimages. In SDM, 2014. 14
Tensor Imaging Analysis Supervised Tensor Learning Tensor classification 1. High dimensionality 2. Structural complexity He et al. DuSK: A dual structurepreserving kernel for supervised tensor learning with applications to neuroimages. In SDM, 2014. 15
Tensor Imaging Analysis Supervised Tensor Learning Tensor classification 1. High dimensionality 2. Structural complexity 3. Nonlinear separability He et al. DuSK: A dual structurepreserving kernel for supervised tensor learning with applications to neuroimages. In SDM, 2014. 16
Tensor Imaging Analysis Supervised Tensor Learning Support tensor machine 1 n min W,b, 2 kwk2 F C X i i=1 s.t. y i (hw, X i i b) 1 i 0, 8i =1,,n. i primary form max 1,, n s.t. nx i=1 i 1 2 nx i y i =0 i=1 dual form nx i,j=1 0 apple i apple C, 8i =1,,n. i j y i y j h (X i ), (X j )i 17 h X X i He et al. DuSK: A dual structurepreserving kernel for supervised tensor learning with applications to neuroimages. In SDM, 2014.
Tensor Imaging Analysis Tensor Data in Neuroimaging Brain Network Analysis Node discovery, edge discovery, network verification, tensor factorization Supervised Tensor Learning Tensor classification, support tensor machine 18
Outline Introduction Tensor Imaging Analysis Brain Network Analysis Multiview Feature Analysis Future Work 19
Brain Network Analysis Brain Network Data Graph Nodes: brain regions e.g., insula, hippocampus, thalamus fmri links: correlations between the functional activities of brain regions DTI links: number of neural fibers connecting different brain regions 20
Brain Network Analysis Kernel Learning on Graphs? how to?? Wee et al. Identification of MCI individuals using structural and functional connectivity networks. Neuroimage, 2012. 21
Brain Network Analysis Kernel Learning on Graphs 1. Extracting clustering coefficients 2. Selecting the most discriminative features 3. Constructing kernel matrices 4. Training classifiers Wee et al. Identification of MCI individuals using structural and functional connectivity networks. Neuroimage, 2012. 22
Brain Network Analysis Kernel Learning on Graphs clustering coefficient of region 1 clustering coefficient of region 2 clustering coefficient of region 3... clustering coefficient of region m classifiers Wee et al. Identification of MCI individuals using structural and functional connectivity networks. Neuroimage, 2012. 23
Brain Network Analysis Kernel Learning on Graphs Contributions: (1) integrate complementary DTI and fmri (2) achieve 96% classification accuracy (3) select the most discriminative brain regions Drawbacks: (1) connectivity structures are blinded (2) interpretability is limited Wee et al. Identification of MCI individuals using structural and functional connectivity networks. Neuroimage, 2012. 24
Brain Network Analysis Subgraph Pattern Mining Alzheimer's disease Alzheimer's disease Alzheimer's disease Normal Normal Normal A discriminative subgraph pattern Kong et al. Discriminative feature selection for uncertain graph classification. In SDM, 2013. 25
Brain Network Analysis Subgraph Pattern Mining Binary links: {0,1} e.g. gspan [Yan and Han 2002] fmri links: [1,1] DTI links: N (0,1,2,,1e6, ) Kong et al. Discriminative feature selection for uncertain graph classification. In SDM, 2013. 26
Brain Network Analysis Subgraph Pattern Mining Mining uncertain graphs Kong et al. Discriminative feature selection for uncertain graph classification. In SDM, 2013. 27
Brain Network Analysis Brain Network Data Kernel Learning on Graphs Extracting clustering coefficients Subgraph Pattern Mining Mining uncertain graphs 28
Outline Introduction Tensor Imaging Analysis Brain Network Analysis Multiview Feature Analysis Future Work 29
Multiview Feature Analysis Multiview Data in Medical Studies limited subjects available yet introducing a large number of measurements MRI sequence A View 2 View 1 View 6 HIV/ seronegative Cognitive measures View 5 multiview learning feature selection MRI sequence B View 3 View 4 Immunologic measures Serologic measures Clinical measures 30
Multiview Feature Analysis Modeling View Correlations Tensor and AAL features from MRI images Demographic information: age and gender Genetic information Ye et al. Heterogeneous data fusion for Alzheimer s disease study. In KDD, 2008. 31
Multiview Feature Analysis Modeling View Correlations x2 (1) x2 (2) x1 (1) x1 (2) x2 (2) x2 (1) concatenation x1 (2) x1 (1) Ye et al. Heterogeneous data fusion for Alzheimer s disease study. In KDD, 2008. 32
Multiview Feature Analysis Modeling View Correlations x2 (1) x2 (2) x1 (1) x1 (2) α 1 α 2 multikernel learning Ye et al. Heterogeneous data fusion for Alzheimer s disease study. In KDD, 2008. 33
Multiview Feature Analysis Modeling View Correlations Contributions: (1) integrate different types of features (2) identify biomarkers (brain regions) from multiple data sources Drawbacks: fail to explicitly consider correlations between features Ye et al. Heterogeneous data fusion for Alzheimer s disease study. In KDD, 2008. 34
Multiview Feature Analysis Modeling Feature Correlations Vectorbased method View 1 modeling feature selection View 2 View 3 multiview data neglect multiview knowledge information loss accuracy drops Cao et al. Tensorbased multiview feature selection with applications to brain diseases. In ICDM, 2014. 35
Multiview Feature Analysis Modeling Feature Correlations Tensorbased method View 1 modeling feature selection View 2 View 3 multiview data suffer from redundancy and irrelevance noise accuracy drops Cao et al. Tensorbased multiview feature selection with applications to brain diseases. In ICDM, 2014. 36
Multiview Feature Analysis Modeling Feature Correlations Dual method View 1 modeling feature selection View 2 View 3 multiview data Cao et al. Tensorbased multiview feature selection with applications to brain diseases. In ICDM, 2014. 37
Multiview Feature Analysis Modeling Feature Correlations Wrapperbased feature selection Training set Classification algorithm Feature set Hypothesis Feature selection Training set Feature set Test set Classification algorithm Classification evaluation Estimated performance Cao et al. Tensorbased multiview feature selection with applications to brain diseases. In ICDM, 2014. 38
Multiview Feature Analysis Modeling Feature Correlations View 1 x i (1) tensor product support tensor machine feature evaluation View 2 View 3 x i (2) x i (3) X i W iteration X i * feature selection W :,...,:,iv,:,...,: Cao et al. Tensorbased multiview feature selection with applications to brain diseases. In ICDM, 2014. 39
Multiview Feature Analysis Multiview Data in Medical Studies Modeling View Correlations Multikernel method Modeling Feature Correlations Vectorbased method, tensorbased method, dual method 40
Outline Introduction Tensor Imaging Analysis Brain Network Analysis Multiview Feature Analysis Future Work 41
Future Work Learning on Different Data Representations neuroimaging experiments tensor data brain network data heterogeneous data fusion other sources, e.g., clinical and serologic experiments multiview data single modality 42
Future Work Integrating Multiple Imaging Modalities fmri: functional connections multiple thresholds [Jie. et al. 2014], multispectrum [Wee et al. 2012] DTI: structural connections multiple physiological parameters: fiber count, fractional anisotropy (FA), mean diffusivity (MD), and principal diffusivities [Wee et al. 2011] 43
Future Work Mining Bioinformatics Information Networks 44
Future Work BRAIN Initiative http://www.whitehouse.gov/brain President Obama is making over $300 million investments in the BRAIN Initiative to revolutionize our understanding of the human mind and uncover new ways to treat, prevent, and cure brain disorders like Alzheimer s, schizophrenia, autism, epilepsy, and traumatic brain injury. 45
Thank you.!