Data Mining Techniques to Predict Survival of Metastatic Breast Cancer Patients

Joshua Datko
Advanced Artificial Intelligence (CS610) Project Proposal
Drexel University
jbd65@drexel.edu

Copyright (c) 2013, by Joshua Datko. This work is made available under the terms of the Creative Commons Attribution 3.0 Unported License.

Abstract

Prognosis for stage IV (metastatic) breast cancer is difficult for clinicians to predict. This study examines the SEER data set from [...] and selects patients who were initially diagnosed with stage IV breast cancer and who died as a direct result of the cancer. After developing a SEER conversion utility, seer2arff, we create three predictive models that use a supervised, passive, offline technique to classify prognosis (survival time). The results of the algorithms from the Weka toolkit were: Bayes Network, 64.2% accurate; J4.8 Decision Tree, 63.5% accurate; and an Artificial Neural Network, 62.9% accurate. The J4.8 Decision Tree selected attributes that confirm the rationale of ongoing clinical studies. This study is the first to apply machine learning techniques to this category of patients with the SEER data set.

Introduction

Breast cancer is one of the most common cancers today. Treatment options vary significantly, from surgery to chemotherapy, depending on many variables such as tumor location, tumor size, and patient characteristics. Once breast cancer is diagnosed, physicians attempt to stage, or classify, the patient's cancer. Stages range from Stage 0 to Stage IV, where Stage IV indicates that the cancer has metastasized beyond the breast and local lymph nodes. The staging classification system also helps to predict prognosis. When breast cancer is diagnosed in Stage IV, the five-year survival rate is 16%, whereas if it is detected early, in Stage I, the rate is 97%.

Through numerous clinical and research efforts, there are well-known classifiers to aid physicians in categorizing a patient's cancer into the appropriate stage. However, once the cancer is in stage IV it is already advanced, and the factors affecting prognosis within stage IV are not as well known. For example, 50% of stage IV patients survive for two years, but fewer than 20% survive for four years or more.

In this paper we present a comparison of three machine learning methods of predicting survival time for stage IV breast cancer patients from the Surveillance, Epidemiology, and End Results (SEER) data set (SEER 2012). SEER is one of the most cited clinical cancer data sets; it is maintained by the National Cancer Institute and has data as far back as [...]. In the SEER data set, survival time is calculated using the date of diagnosis and one of the following: date of death, date last known to be alive, or follow-up cutoff date used for this [data]. This definition is too inclusive, and therefore records will be filtered to select patients who were initially diagnosed with stage IV breast cancer and who have died as a direct result of the cancer.

Background

The machine learning techniques used in this study are supervised, passive, offline algorithms that seek to predict classification. Data mining is a subset of machine learning that attempts to gain previously unknown insight from data. In a supervised learning technique, the program is given a series of inputs, x_i, and outputs, y_i, and attempts to find a function f such that f(x) = y. A passive algorithm is one that does not interfere with the data. Offline indicates that the data has already been collected, as opposed to online, where the data is being generated at the time of analysis. Prediction is the process of computing f(x_{i+1}) = y_{i+1}. Finally, classification is selecting an output y from a finite set of values.

The three algorithms chosen for this study were a J4.8 Decision Tree, a Bayes Network, and an Artificial Neural Network (Multi-layer Perceptron). The following sections provide a brief overview of the algorithms.

J4.8 Decision Tree

The J4.8 decision tree algorithm is Weka's implementation of Ross Quinlan's C4.5 decision tree algorithm. Specifically, it is the implementation of C4.5 revision 8 (Witten, Frank, and Hall 2011). It is a recursive algorithm that seeks to split the data so as to maximize information gain. Information gain quantifies the insight gained from a particular attribute. An attribute that splits the data into groups dominated by a single class has higher information gain than an attribute whose groups have a uniform class distribution. The result of this algorithm is a decision tree that is used to classify future entries.

Bayes Network

A Bayes Network is a directed acyclic graph that encodes the conditional probability table for each node. Once constructed, Bayes Networks can answer questions like: given the following attributes, what is the probability of being classified as X? To apply the Bayes Network, the network must first be constructed or, in the case of machine learning, learned. We use the K2 Bayesian network learner, which employs a hill climbing algorithm restricted by an order on the variables.

Artificial Neural Network (Multi-layer Perceptron)

The multi-layer perceptron emulates the understanding of how neurons in the brain function. In general, a neural network is a collection of nodes, each defined by an activation function; when a node's input exceeds the activation threshold, the neuron fires, producing output on the edges leaving it. In a supervised environment, given the input and the known output, the network adjusts the weights feeding each neuron's activation function so that the network's output corresponds with the known output.

Validation Techniques

In order to quantify how well the model fits the data, several validation mechanisms can be evaluated. The one used in this study is k-fold cross-validation. In this technique, the data set is split into k subsets and there are k rounds of learning. Each round holds out one subset (1/k of the data) as test data and uses the remaining k - 1 subsets as the training set, from which the model is generated. The end result is averaged over the k rounds.

Approach

Data Preparation

Preparing the data for analysis is a non-trivial and time-consuming process. The SEER data set facilitates analysis by independent tools since the data is in ASCII text files. The data set itself is publicly accessible, but only after registration and signing of the user agreement on the SEER website. Included in the distribution is a thorough data dictionary detailing each of the 134 attributes (columns). Out of the SEER data set, only the attributes in Table 1 were selected. Some attributes were not selected due to duplicate data.
For example, in breast cancer, Tumor Marker 1 is the same as ER Status Recode, and therefore the more descriptively named column was selected. Some attributes were not selected due to mutual exclusion: since AJCC Stage 3rd Edition was selected, only data from [...] were applicable. Some data were also not selected due to lack of statistical relevance. In our data set, fewer than 1% of the patients were men, so sex was not included in the machine learning analysis.

A query to filter the data set further was developed. This query, whose SEER values are shown in Table 2, is described in natural language as follows: select all records of patients who were initially diagnosed with stage IV breast cancer, and who have died, and who have died as a direct result of the cancer. Selecting based on vital status and cause of death is similar to the query performed in (Bellaachia and Guven 2006); however, we further narrowed the data to stage IV only. This reduced the total records to 8,726. Table 3 shows the breakdown of the query against the total data set.

SEER Attribute
Marital Status at DX
Age at DX
Year of DX
Grade
EOD-Tumor Size
EOD-Lymph Node Involv
Reason for no surgery
Race recode
ER Status Recode
PR Status Recode
AJCC Stage 3rd ed
SEER Cause-Specific Death Classification
Vital Status Recode
Survival time recode
ARFF type    Numeric    Numeric    Numeric
Table 1: List of analyzed attributes

SEER Attribute                              Filter by
AJCC Stage 3rd ed
SEER Cause-Specific Death Classification    1
Vital Status Recode                         4
Table 2: Query to filter the data set

Only 1.3% of the available data set is being analyzed, but most of the exclusion is a result of the restriction of the time frame. There are several staging codes used throughout the SEER data set, and we chose to use only one (AJCC 3rd edition) for consistency. This restricted the data to [...]. Once the data was filtered, the columns in Table 2 were removed from the analysis since all the records contained the same value.
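
The selection query in Table 2 can be sketched in Python, the language seer2arff was written in. This is a minimal illustration and not the seer2arff code itself: the function name and the dictionary keys are hypothetical, and the stage IV code is passed in as a parameter because its value does not appear in Table 2 as reproduced here. The values 1 and 4 are the Table 2 filter values.

```python
# Hypothetical record layout: each SEER record is a dict of already-parsed fields.
# The key names below are illustrative, not SEER's or seer2arff's own names.
FILTER_COLUMNS = ("ajcc_stage_3rd", "cause_specific_death", "vital_status")


def select_stage_iv_deaths(records, stage_iv_code):
    """Keep stage IV patients who died as a direct result of the cancer."""
    selected = []
    for record in records:
        if (record["ajcc_stage_3rd"] == stage_iv_code
                and record["cause_specific_death"] == 1  # Table 2 filter value
                and record["vital_status"] == 4):        # Table 2 filter value
            # After selection the filter columns hold a single constant value,
            # so they carry no information for the learner and are dropped.
            selected.append(
                {key: value for key, value in record.items()
                 if key not in FILTER_COLUMNS}
            )
    return selected
```

Dropping the filter columns inside the same pass mirrors the remark above that those columns were removed once every remaining record contained the same value.
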

seer2arff

A conversion utility called seer2arff, written by the author in Python, was developed to transform the SEER data into ARFF for processing by the Weka toolkit. The Weka workbench, developed by the University of Waikato, is a suite of machine learning algorithms implemented in Java and a framework in which one can run multiple data mining experiments. The ARFF format, described in (Witten, Frank, and Hall 2011), requires tagging attributes by data type; the major types are numeric, nominal, and string. A nominal attribute is one with a finite set of discrete values. As shown in Table 1, most of the analyzed data is nominal. Survival Time Recode (STR) required conversion to a nominal value. The original data was in the format YYMM, and it was converted to a nominal attribute as shown in Table 4. One year was selected because this split produced a reasonably partitioned population.

Query                                     Count    Percent
Total SEER breast cancer patients         657,     %
Number of patients ( )                    331,     %
Diagnosed with Stage IV ( )               12,      %
Stage IV and have died                    11,      %
Stage IV and have died from the cancer    8,726    1.3%
Table 3: Population selection

Survival Time Recode Class    Percent of Population
Survival Time <= 1 year       %
Survival Time > 1 year        %
Table 4: Survival Time Recode in nominal form

Not all of the SEER data were complete. For example, ER status information was collected for only 59% of all the patients selected. Typically, when data is missing from the SEER data set, it is encoded with 9 or 99. Those instances were replaced with a question mark character, which represents missing data in ARFF.

Machine learning experimentation

Three Weka data mining classification algorithms were chosen for analysis: J4.8 Decision Tree, Bayesian Network, and Multi-layer Perceptron. J4.8 is Weka's implementation of the C4.5 Decision Tree algorithm; it was run with the default parameters, except that the minimum number of objects for a leaf node was set to 100. The default setting is two, which creates a much more complex tree. Bayesian Network and Multi-layer Perceptron were both run with their default parameters. To maximize the use of the data set, cross-validation with 10 folds was chosen for each algorithm.

Evaluation

In this section, we show and analyze the results of the three machine learning algorithms. Table 5 summarizes the results. Accuracy refers to the percentage of correctly classified results across both classes (less than or equal to one year, and greater than one year). Precision is defined as the number of true positives for a given class divided by the sum of true and false positives. Recall is the number of true positives divided by the number of true positives plus false negatives. Finally, F-Measure is a combination of precision and recall, given by the equation 2PR/(P + R).

Overall, the accuracies of the three algorithms are within 2 percentage points of each other. However, the accuracy is much less than the 86.7% reported in (Bellaachia and Guven 2006) and other research.
There are two main factors affecting this discrepancy. The first is that we are only analyzing stage IV patients. In (Bellaachia and Guven 2006), the highest ranked attributes from their J4.8 decision tree were Extension of tumor, Stage of cancer, and Lymph node involvement, all three of which are directly correlated with cancer stage. In fact, five of their top ranked attributes are significant factors in determining cancer stage (which is their sixth attribute). Considering these attributes, one should expect greater accuracy. Secondly, their query resulted in 151,886 records, whereas ours returned only 8,726, which is 94% fewer. Fortunately (for the patients), a significantly smaller number of patients are initially diagnosed in stage IV. While more data is not guaranteed to raise accuracy, our models may be improved with more records.

Algorithm    Acc.     Class    P    R    F-Measure
BayesNet     64.2%
NeuralNet    62.9%
J4.8 Tree    63.5%
Table 5: Combined Results. (Acc. = Accuracy, P = Precision, R = Recall)

Bayes Network

Overall, the Bayes Network was the most accurate algorithm of the three. In the network that was produced, each attribute is a child node of survival time. Each node has a probability distribution that can be used to solve for the conditional probability of that attribute given survival time. For example, the table for the surgery attribute gives a 58.6% probability that surgery was performed among patients who survived more than one year, but only 36.7% among patients who survived one year or less. This attribute was also selected as a distinguishing characteristic by the J4.8 decision tree algorithm, discussed in the following section. The Bayes Net also had the highest F-Measure for classifying patients who survived for more than one year. However, the true distribution of patients is slightly weighted toward this category, as Table 4 shows.
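
The evaluation procedure used in this section, k-fold cross-validation followed by accuracy, precision, recall, and F-Measure, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not Weka's implementation: all function names are hypothetical, `train` stands for any routine that takes training inputs and labels and returns a prediction function, and the folds follow the standard scheme in which each fold is held out once for testing while the model is trained on the remaining folds.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, train, k=10):
    """Mean accuracy over k rounds; each round tests on one held-out fold."""
    accuracies = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict = train(train_x, train_y)  # hypothetical learner interface
        correct = sum(1 for i in test_idx if predict(xs[i]) == ys[i])
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


def precision_recall_f(actual, predicted, positive):
    """Precision, recall and F-Measure (2PR / (P + R)) for one class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def train_majority(xs, ys):
    """Toy learner for demonstration: always predict the most common label."""
    most_common = max(set(ys), key=ys.count)
    return lambda x: most_common
```

A majority-class learner such as `train_majority` is a useful baseline here: because the two survival classes are not evenly balanced, a model's accuracy is only meaningful relative to what always predicting the larger class would achieve.
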

J4.8 Decision Tree

The J4.8 tree underperformed the Bayes Network in every area except recall for patients in the greater-than-one-year category, in which it fared marginally better. The resulting decision tree provides a human-readable model of the classifier. The full tree is shown in Appendix A. The decision tree algorithm uses a recursive approach, selecting the most discriminating attribute (highest information gain) at each level.

Multi-layer Perceptron (Neural Network)

While the Multi-layer Perceptron had the lowest accuracy, it had the highest recall for class 1 patients. Unfortunately, as is generally the case with neural networks, it is difficult to glean any additional understanding from the model.

Clinical Analysis

The decision tree yielded some interesting observations about the data set. The highest ranked attribute was age at diagnosis, with a cutoff of 76 years. The mean age at diagnosis is 63 years, with a standard deviation of 14.4 years, so the 76-year cutoff lies roughly one standard deviation above the mean (63 + 14.4 = 77.4). This may be intuitive, as older patients are more likely to have other medical complications and to experience a greater number of side effects from treatment. Age is a complex factor in this study. While studies have shown that age alone is not a dominant factor affecting prognosis, it is believed that older women are under-treated and are not prescribed chemotherapy as aggressive as that given to younger women. The J4.8 decision tree predicted that a woman over the age of 76 would live longer than one year only if she had surgery and was ER positive.

The next highest ranked attribute, reason for no surgery, is a controversial choice. Retrospective data seem to indicate that surgery is beneficial; however, this may be confounded by other factors present when the decision to perform surgery is made, such as the patient's performance status and the extent of their metastatic disease. To study this prospectively, there is an ongoing Eastern Cooperative Oncology Group study (NCT [...]) that randomizes patients to surgery or no surgery in order to determine whether breast cancer surgery that removes the primary tumor increases the survival of stage IV patients. The results of the J4.8 Decision Tree, which indicate that surgery is an important factor in stage IV prognosis, support the rationale of this study.

ER status was determined to be the third most important characteristic. ER status represents whether the estrogen receptor is present in the cancer cell, as determined by an immunohistochemistry test. When the ER result is positive, different treatment options are available to the patient, specifically hormonal treatments, which have fewer side effects than traditional chemotherapy. In the decision tree, in general, when ER was positive the patient was predicted to survive longer than one year.

Marital status at diagnosis was not ranked very high, but interestingly it was used as a final tie breaker in one case. On this branch, patients who were married survived longer than one year by almost 2:1. In all other marital categories (widowed, divorced, not married, or separated), there were 2:1 odds of dying within one year. This supports the findings in (Osborne et al. 2005), which show that older unmarried women are at an increased risk of death from breast cancer.
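
The attribute ranking discussed above comes from the tree's splitting criterion, described in this paper as information gain. The sketch below shows one common way to compute it, as the entropy of the class labels minus the weighted entropy after splitting on an attribute. It is an illustration only: the function names and the toy records are hypothetical and are not SEER data, and C4.5 itself normally ranks splits by gain ratio, a normalized variant of this quantity.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after the split."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(
        len(group) / len(labels) * entropy(group)
        for group in groups.values()
    )
    return entropy(labels) - remainder


# Toy records (hypothetical): "a" separates the classes perfectly,
# while "b" leaves both groups evenly mixed.
toy_rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 0},
    {"a": 1, "b": 1},
]
toy_labels = [1, 1, 2, 2]
```

On the toy records, splitting on "a" yields two pure groups while splitting on "b" leaves each group as mixed as the whole set, so a tree builder would choose "a" first.
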

Oncologist's Predictions

A breast cancer oncologist was consulted and asked what factors she would consider most important in developing a patient's prognosis for a stage IV diagnosis. She listed (in no particular order): ER status, tumor grade, the extent of the spread of metastases, and the patient's performance status. Performance status refers to the Karnofsky Performance Status Scale, which quantifies the patient's well-being. Patients with a rating of 60% and below require some degree of assistance for care and show more symptoms. This data is not available in the SEER data set. The spread of metastasis is available (CS Mets at DX), but only on records from 2004 and later. The spread matters because, for example, if the breast cancer spreads to the liver or brain, the clinician predicts survival to be much shorter than if it spread only to the bone or the lymph nodes. Both tumor grade and ER status were selected by the decision tree; however, they were not considered as distinguishing as age and reason for no surgery.

Related Work and Novelty

Several researchers have used the SEER database and Weka to predict survival of breast cancer patients. In (Delen, Walker, and Kadam 2005), three data mining algorithms were tested against the SEER data set, and a decision tree approach predicted survivability, defined by the authors as surviving longer than sixty months, with 93.6% accuracy. (Bellaachia and Guven 2006) performed data mining on the SEER database with the Weka toolkit and showed that the extension of the tumor was the most contributing factor to survivability, followed closely by the stage of the cancer. (Endo, Shibata, and Tanaka 2007) used SEER data from [...] and selected ten independent variables to predict survivability, defined as surviving longer than sixty months.

However, there has not been a study investigating only metastatic breast cancer patients. Furthermore, each study above included the stage of breast cancer in its analysis.
Staging is information derived from other empirical patient data and is therefore not an independent factor. Also, the AJCC Staging system is designed as an indicator of prognosis and as an aid for clinicians in developing a treatment plan; therefore, staging should not be included in machine learning techniques to predict survival. By definition, an analysis restricted to stage IV patients is stageless, since stage IV is the last stage; it is also the most volatile in terms of survivability.

Conclusion

While the accuracy of the model was much less than the 80% anticipated in the proposal, we consider this initial research encouraging. Three independent algorithms produced a consistent accuracy estimate, and the decision tree confirms important factors in clinical research. In fact, age and reason for no surgery are two factors currently under study. To our knowledge, this analysis is the first to focus only on stage IV breast cancer patients. Other data mining research has shown that machine learning techniques can confirm, with high accuracy, the prognosis assigned by the staging system. Our research, however, used machine learning to gain insight into a new question that had not previously been asked.

Future Work

A similar study can be extended to include patients in the year group 2004 and later. These records would include the extent of the metastases, which may be an important prognosis indicator. The 2004 data also includes the HER2 indicator, which oncologists use in addition to ER to guide treatment options.

References

Bellaachia, A., and Guven, E. 2006. Predicting Breast Cancer Survivability Using Data Mining Techniques.

Delen, D.; Walker, G.; and Kadam, A. 2005. Predicting breast cancer survivability: a comparison of three data mining methods. Artificial Intelligence in Medicine 34(2).

Endo, A.; Shibata, T.; and Tanaka, H. 2007. Comparison of seven algorithms to predict breast cancer survival. International Journal of Biomedical Soft Computing and Human Sciences 13.

Osborne, C.; Ostir, G. V.; Du, X.; Peek, M. K.; and Goodwin, J. S. 2005. The influence of marital status on the stage at diagnosis, treatment, and survival of older women with breast cancer. Breast Cancer Research and Treatment 93(1).

SEER. 2012. Surveillance, Epidemiology, and End Results (SEER) Program research data. Released April 2012, based on the November 2011 submission.

Witten, I.; Frank, E.; and Hall, M. A. 2011. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.

Appendix A: Complete J4.8 Decision Tree

age-at-dx <= 76
|   reason-for-no-surgery = 0: 2 ( / )
|   reason-for-no-surgery = 1
|   |   er-status-recode-breast-cancer = 1
|   |   |   age-at-dx <= 58: 2 (683.28/260.55)
|   |   |   age-at-dx > 58
|   |   |   |   race-recode = 1
|   |   |   |   |   grade = 1: 1 (36.48/15.69)
|   |   |   |   |   grade = 2: 2 (251.76/109.72)
|   |   |   |   |   grade = 3
|   |   |   |   |   |   marital-status-at-dx = 1: 1 (37.86/14.13)
|   |   |   |   |   |   marital-status-at-dx = 2: 2 (218.57/96.5)
|   |   |   |   |   |   marital-status-at-dx = 3: 1 (5.57/2.66)
|   |   |   |   |   |   marital-status-at-dx = 4: 1 (54.46/23.52)
|   |   |   |   |   |   marital-status-at-dx = 5: 1 (111.76/50.33)
|   |   |   |   |   grade = 4: 1 (37.22/15.54)
|   |   |   |   race-recode = 2: 1 (106.96/41.69)
|   |   |   |   race-recode = 3: 1 (0.68/0.0)
|   |   |   |   race-recode = 4: 1 (50.33/20.48)
|   |   |   |   race-recode = 7: 1 (0.0)
|   |   er-status-recode-breast-cancer = 2: 1 (764.41/304.12)
|   |   er-status-recode-breast-cancer = 3: 1 (9.01/2.68)
|   reason-for-no-surgery = 2: 1 (132.13/52.04)
|   reason-for-no-surgery = 6
|   |   er-status-recode-breast-cancer = 1: 2 (518.41/229.21)
|   |   er-status-recode-breast-cancer = 2: 1 (217.76/87.6)
|   |   er-status-recode-breast-cancer = 3: 2 (10.57/3.16)
|   reason-for-no-surgery = 7: 1 (152.15/66.04)
|   reason-for-no-surgery = 8: 2 (45.04/17.03)
age-at-dx > 76
|   reason-for-no-surgery = 0
|   |   er-status-recode-breast-cancer = 1: 2 (485.98/221.5)
|   |   er-status-recode-breast-cancer = 2: 1 (168.34/55.07)
|   |   er-status-recode-breast-cancer = 3: 1 (5.86/2.84)
|   reason-for-no-surgery = 1: 1 (644.15/196.38)
|   reason-for-no-surgery = 2: 1 (55.1/7.03)
|   reason-for-no-surgery = 6: 1 (219.39/57.13)
|   reason-for-no-surgery = 7: 1 (104.19/32.06)
|   reason-for-no-surgery = 8: 1 (2.0/1.0)

Example: reason-for-no-surgery = 0: 2 ( / )

When reason for no surgery equals 0 (surgery was performed), patients were classified as a 2 (survival time greater than one year); [...] instances were classified correctly and [...] were classified incorrectly. Table 6 contains the descriptions of the codes and Table 4 defines the final survival time recode classifications.

SEER Attribute           Code    Description
Reason for no surgery    0       Surgery performed
                         1       Surgery not recommended
                         2       Autopsy only case
                         5       Patient died before recommended surgery
                         6       Unknown reason for no surgery
                         7       Patient or patient's guardian refused
ER status recode         1       Positive
                         2       Negative
                         3       Borderline
Race Recode              1       White
                         2       Black
                         3       American Indian/Alaska Native
                         4       Asian or Pacific Islander
                         7       Other
Grade                    1       Well differentiated
                         2       Moderately differentiated
                         3       Poorly differentiated
                         4       Anaplastic
Marital Status at DX     1       Single (never married)
                         2       Married (including common law)
                         3       Separated
                         4       Divorced
                         5       Widowed
                         6       Unmarried or domestic partner
Table 6: J4.8 Legend
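
Read together with the legend in Table 6, the tree in Appendix A can be followed by hand or transcribed into a small function. The sketch below is such a transcription in Python. The function and parameter names are hypothetical, it handles only codes that appear in the printed tree, and it does not reproduce how Weka treats missing values. It returns 1 for a predicted survival time of one year or less and 2 for more than one year.

```python
def predict_survival_class(age, reason_no_surgery, er_status,
                           race=None, grade=None, marital=None):
    """Follow the Appendix A tree; return class 1 (<= 1 year) or 2 (> 1 year).

    Arguments use the numeric codes from Table 6. race, grade and marital
    are only consulted on the one branch of the tree that tests them.
    """
    if reason_no_surgery not in (0, 1, 2, 6, 7, 8):
        raise ValueError("reason-for-no-surgery code not present in the tree")

    if age > 76:
        # Only surgery (code 0) combined with ER positive predicts class 2.
        if reason_no_surgery == 0 and er_status == 1:
            return 2
        return 1

    # From here on: age <= 76.
    if reason_no_surgery in (0, 8):
        return 2
    if reason_no_surgery in (2, 7):
        return 1
    if reason_no_surgery == 6:
        return 1 if er_status == 2 else 2

    # reason_no_surgery == 1 (surgery not recommended)
    if er_status != 1:
        return 1
    if age <= 58:
        return 2
    if race != 1:
        return 1
    if grade == 2:
        return 2
    if grade == 3:
        # Marital status acts as the final tie breaker: only code 2 (married)
        # leads to class 2.
        return 2 if marital == 2 else 1
    return 1  # grade 1 or grade 4
```

For example, `predict_survival_class(80, 0, 1)` follows the age-at-dx > 76 branch, finds that surgery was performed and ER status is positive, and returns class 2, which is the only path for that age group that predicts survival beyond one year.
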


More information

Chapter 13 Cancer of the Female Breast

Chapter 13 Cancer of the Female Breast Lynn A. Gloeckler Ries and Milton P. Eisner INTRODUCTION This study presents survival analyses for female breast cancer based on 302,763 adult cases from the Surveillance, Epidemiology, and End Results

More information

TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS)

TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS) TITLE: A Data-Driven Approach to Patient Risk Stratification for Acute Respiratory Distress Syndrome (ARDS) AUTHORS: Tejas Prahlad INTRODUCTION Acute Respiratory Distress Syndrome (ARDS) is a condition

More information

Performance Evaluation of Machine Learning Algorithms in the Classification of Parkinson Disease Using Voice Attributes

Performance Evaluation of Machine Learning Algorithms in the Classification of Parkinson Disease Using Voice Attributes Performance Evaluation of Machine Learning Algorithms in the Classification of Parkinson Disease Using Voice Attributes J. Sujatha Research Scholar, Vels University, Assistant Professor, Post Graduate

More information

Assignment Question Paper I

Assignment Question Paper I Subject : - Discrete Mathematics Maximum Marks : 30 1. Define Harmonic Mean (H.M.) of two given numbers relation between A.M.,G.M. &H.M.? 2. How we can represent the set & notation, define types of sets?

More information

CHAPTER 2 MAMMOGRAMS AND COMPUTER AIDED DETECTION

CHAPTER 2 MAMMOGRAMS AND COMPUTER AIDED DETECTION 9 CHAPTER 2 MAMMOGRAMS AND COMPUTER AIDED DETECTION 2.1 INTRODUCTION This chapter provides an introduction to mammogram and a description of the computer aided detection methods of mammography. This discussion

More information

A Novel Iterative Linear Regression Perceptron Classifier for Breast Cancer Prediction

A Novel Iterative Linear Regression Perceptron Classifier for Breast Cancer Prediction A Novel Iterative Linear Regression Perceptron Classifier for Breast Cancer Prediction Samuel Giftson Durai Research Scholar, Dept. of CS Bishop Heber College Trichy-17, India S. Hari Ganesh, PhD Assistant

More information

Summary of the BreastScreen Aotearoa Mortality Evaluation

Summary of the BreastScreen Aotearoa Mortality Evaluation Summary of the BreastScreen Aotearoa Mortality Evaluation 1999 2011 Released 2015 nsu.govt.nz Citation: Ministry of Health. 2015. Summary of the BreastScreen Aotearoa Mortality Evaluation 1999 2011. Wellington:

More information

Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision in Pune, India

Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision in Pune, India 20th International Congress on Modelling and Simulation, Adelaide, Australia, 1 6 December 2013 www.mssanz.org.au/modsim2013 Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision

More information

Racial differences in six major subtypes of melanoma: descriptive epidemiology

Racial differences in six major subtypes of melanoma: descriptive epidemiology Wang et al. BMC Cancer (2016) 16:691 DOI 10.1186/s12885-016-2747-6 RESEARCH ARTICLE Racial differences in six major subtypes of melanoma: descriptive epidemiology Yu Wang 1, Yinjun Zhao 2 and Shuangge

More information

Application of Artificial Neural Network-Based Survival Analysis on Two Breast Cancer Datasets

Application of Artificial Neural Network-Based Survival Analysis on Two Breast Cancer Datasets Application of Artificial Neural Network-Based Survival Analysis on Two Breast Cancer Datasets Chih-Lin Chi a, W. Nick Street b, William H. Wolberg c a Health Informatics Program, University of Iowa b

More information

Predicting Juvenile Diabetes from Clinical Test Results

Predicting Juvenile Diabetes from Clinical Test Results 2006 International Joint Conference on Neural Networks Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada July 16-21, 2006 Predicting Juvenile Diabetes from Clinical Test Results Shibendra Pobi

More information

Multi Parametric Approach Using Fuzzification On Heart Disease Analysis Upasana Juneja #1, Deepti #2 *

Multi Parametric Approach Using Fuzzification On Heart Disease Analysis Upasana Juneja #1, Deepti #2 * Multi Parametric Approach Using Fuzzification On Heart Disease Analysis Upasana Juneja #1, Deepti #2 * Department of CSE, Kurukshetra University, India 1 upasana_jdkps@yahoo.com Abstract : The aim of this

More information

Credal decision trees in noisy domains

Credal decision trees in noisy domains Credal decision trees in noisy domains Carlos J. Mantas and Joaquín Abellán Department of Computer Science and Artificial Intelligence University of Granada, Granada, Spain {cmantas,jabellan}@decsai.ugr.es

More information

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Final Project Report CS 229 Autumn 2017 Category: Life Sciences Maxwell Allman (mallman) Lin Fan (linfan) Jamie Kang (kangjh) 1 Introduction

More information

ABSTRACT I. INTRODUCTION. Mohd Thousif Ahemad TSKC Faculty Nagarjuna Govt. College(A) Nalgonda, Telangana, India

ABSTRACT I. INTRODUCTION. Mohd Thousif Ahemad TSKC Faculty Nagarjuna Govt. College(A) Nalgonda, Telangana, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 1 ISSN : 2456-3307 Data Mining Techniques to Predict Cancer Diseases

More information

Applications of Machine learning in Prediction of Breast Cancer Incidence and Mortality

Applications of Machine learning in Prediction of Breast Cancer Incidence and Mortality Applications of Machine learning in Prediction of Breast Cancer Incidence and Mortality Nadia Helal and Eman Sarwat Radiation Safety Dep. NCNSRC., Atomic Energy Authority, 3, Ahmed El Zomor St., P.Code

More information

4/10/2018. SEER EOD and Summary Stage. Overview KCR 2018 SPRING TRAINING. What is SEER EOD? Ambiguous Terminology General Guidelines

4/10/2018. SEER EOD and Summary Stage. Overview KCR 2018 SPRING TRAINING. What is SEER EOD? Ambiguous Terminology General Guidelines SEER EOD and Summary Stage KCR 2018 SPRING TRAINING Overview What is SEER EOD Ambiguous Terminology General Guidelines EOD Primary Tumor EOD Regional Nodes EOD Mets SEER Summary Stage 2018 Site Specific

More information

SAQ-Adult Probation III & SAQ-Short Form

SAQ-Adult Probation III & SAQ-Short Form * * * SAQ-Adult Probation III & SAQ-Short Form 2002 RESEARCH STUDY This report summarizes SAQ-Adult Probation III (SAQ-AP III) and SAQ- Short Form test data for 17,254 adult offenders. The SAQ-Adult Probation

More information

Multilayer Perceptron Neural Network Classification of Malignant Breast. Mass

Multilayer Perceptron Neural Network Classification of Malignant Breast. Mass Multilayer Perceptron Neural Network Classification of Malignant Breast Mass Joshua Henry 12/15/2017 henry7@wisc.edu Introduction Breast cancer is a very widespread problem; as such, it is likely that

More information

Evaluation of Abstracting: Cancers Diagnosed in MCSS Quality Control Report 2005:2. Elaine N. Collins, M.A., R.H.I.A., C.T.R

Evaluation of Abstracting: Cancers Diagnosed in MCSS Quality Control Report 2005:2. Elaine N. Collins, M.A., R.H.I.A., C.T.R Evaluation of Abstracting: Cancers Diagnosed in 2001 MCSS Quality Control Report 2005:2 Elaine N. Collins, M.A., R.H.I.A., C.T.R Jane E. Braun, M.S., C.T.R John Soler, M.P.H September 2005 Minnesota Department

More information

COMPARISON OF DECISION TREE METHODS FOR BREAST CANCER DIAGNOSIS

COMPARISON OF DECISION TREE METHODS FOR BREAST CANCER DIAGNOSIS COMPARISON OF DECISION TREE METHODS FOR BREAST CANCER DIAGNOSIS Emina Alickovic, Abdulhamit Subasi International Burch University, Faculty of Engineering and Information Technologies Sarajevo, Bosnia and

More information

Surgical Management of Metastatic Colon Cancer: analysis of the Surveillance, Epidemiology and End Results (SEER) database

Surgical Management of Metastatic Colon Cancer: analysis of the Surveillance, Epidemiology and End Results (SEER) database Surgical Management of Metastatic Colon Cancer: analysis of the Surveillance, Epidemiology and End Results (SEER) database Hadi Khan, MD 1, Adam J. Olszewski, MD 2 and Ponnandai S. Somasundar, MD 1 1 Department

More information

A Fuzzy Improved Neural based Soft Computing Approach for Pest Disease Prediction

A Fuzzy Improved Neural based Soft Computing Approach for Pest Disease Prediction International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 13 (2014), pp. 1335-1341 International Research Publications House http://www. irphouse.com A Fuzzy Improved

More information

Breast Cancer Diagnosis and Prognosis

Breast Cancer Diagnosis and Prognosis Breast Cancer Diagnosis and Prognosis Patrick Pantel Department of Computer Science University of Manitoba Winnipeg, Manitoba, Canada R3T 2N2 ppantel@cs.umanitoba.ca Abstract Breast cancer accounts for

More information

FORECASTING TRENDS FOR PROACTIVE CRIME PREVENTION AND DETECTION USING WEKA DATA MINING TOOL-KIT ABSTRACT

FORECASTING TRENDS FOR PROACTIVE CRIME PREVENTION AND DETECTION USING WEKA DATA MINING TOOL-KIT ABSTRACT FORECASTING TRENDS FOR PROACTIVE CRIME PREVENTION AND DETECTION USING WEKA DATA MINING TOOL-KIT Ramesh Singh National Informatics Centre, New Delhi, India Rahul Thukral Department Of Computer Science And

More information

Rajiv Gandhi College of Engineering, Chandrapur

Rajiv Gandhi College of Engineering, Chandrapur Utilization of Data Mining Techniques for Analysis of Breast Cancer Dataset Using R Keerti Yeulkar 1, Dr. Rahila Sheikh 2 1 PG Student, 2 Head of Computer Science and Studies Rajiv Gandhi College of Engineering,

More information

Sociodemographic and Clinical Predictors of Triple Negative Breast Cancer

Sociodemographic and Clinical Predictors of Triple Negative Breast Cancer University of Kentucky UKnowledge Theses and Dissertations--Public Health (M.P.H. & Dr.P.H.) College of Public Health 2017 Sociodemographic and Clinical Predictors of Triple Negative Breast Cancer Madison

More information

Improved Intelligent Classification Technique Based On Support Vector Machines

Improved Intelligent Classification Technique Based On Support Vector Machines Improved Intelligent Classification Technique Based On Support Vector Machines V.Vani Asst.Professor,Department of Computer Science,JJ College of Arts and Science,Pudukkottai. Abstract:An abnormal growth

More information

HEALTH CARE DISPARITIES. Bhuvana Ramaswamy MD MRCP The Ohio State University Comprehensive Cancer Center

HEALTH CARE DISPARITIES. Bhuvana Ramaswamy MD MRCP The Ohio State University Comprehensive Cancer Center HEALTH CARE DISPARITIES Bhuvana Ramaswamy MD MRCP The Ohio State University Comprehensive Cancer Center Goals Understand the epidemiology of breast cancer Understand the broad management of breast cancer

More information

Prediction of Heart Attack risk from Behavioral habits and Demographic variables: An Artificial Neural Network approach

Prediction of Heart Attack risk from Behavioral habits and Demographic variables: An Artificial Neural Network approach International Journal of Engineering Science Invention (IJESI) ISSN (Online): 2319 6734, ISSN (Print): 2319 6726 www.ijesi.org PP. 74-79 Prediction of Heart Attack risk from Behavioral habits and Demographic

More information

Data Mining in Bioinformatics Day 4: Text Mining

Data Mining in Bioinformatics Day 4: Text Mining Data Mining in Bioinformatics Day 4: Text Mining Karsten Borgwardt February 25 to March 10 Bioinformatics Group MPIs Tübingen Karsten Borgwardt: Data Mining in Bioinformatics, Page 1 What is text mining?

More information

I.2 CNExT This section was software specific and deleted in 2008.

I.2 CNExT This section was software specific and deleted in 2008. CANCER REPORTING IN CALIFORNIA: ABSTRACTING AND CODING PROCEDURES FOR HOSPITALS California Cancer Reporting System Standards, Volume I Changes and Clarifications 8th th Edition Revised May 2008 SECTION

More information

A BIOINFORMATIC TOOL FOR BREAST CANCER PREDICTION USING MACHINE LEARNING TECHNIQUES

A BIOINFORMATIC TOOL FOR BREAST CANCER PREDICTION USING MACHINE LEARNING TECHNIQUES International Journal of Computer Engineering and Applications, Volume VII, Issue III, September 14 A BIOINFORMATIC TOOL FOR BREAST CANCER PREDICTION USING MACHINE LEARNING TECHNIQUES Megha Rathi 1, Vikas

More information

DAYS IN PANCREATIC CANCER

DAYS IN PANCREATIC CANCER HOSPITAL AND MEDICAL CARE DAYS IN PANCREATIC CANCER Annals of Surgical Oncology, March 27, 2012 Casey B. Duncan, Kristin M. Sheffield, Daniel W. Branch, Yimei Han, Yong-Fang g Kuo, James S. Goodwin, Taylor

More information

Time-to-Recur Measurements in Breast Cancer Microscopic Disease Instances

Time-to-Recur Measurements in Breast Cancer Microscopic Disease Instances Time-to-Recur Measurements in Breast Cancer Microscopic Disease Instances Ioannis Anagnostopoulos 1, Ilias Maglogiannis 1, Christos Anagnostopoulos 2, Konstantinos Makris 3, Eleftherios Kayafas 3 and Vassili

More information

Mammographic density and risk of breast cancer by tumor characteristics: a casecontrol

Mammographic density and risk of breast cancer by tumor characteristics: a casecontrol Krishnan et al. BMC Cancer (2017) 17:859 DOI 10.1186/s12885-017-3871-7 RESEARCH ARTICLE Mammographic density and risk of breast cancer by tumor characteristics: a casecontrol study Open Access Kavitha

More information

Chapter 5: Epidemiology of MBC Challenges with Population-Based Statistics

Chapter 5: Epidemiology of MBC Challenges with Population-Based Statistics Chapter 5: Epidemiology of MBC Challenges with Population-Based Statistics Musa Mayer 1 1 AdvancedBC.org, Abstract To advocate most effectively for a population of patients, they must be accurately described

More information

This section allows identifying the facility, this information is important for data quality follow up. Source of Standard. Source of Standard

This section allows identifying the facility, this information is important for data quality follow up. Source of Standard. Source of Standard Data Dictionary Case Administration This section allows identifying the facility, this information is important for data quality follow up Facility Name This identifier is needed to evaluate This data

More information

A COMPARITIVE SURVEY ON DATA MINING TECHNIQUES FOR BREAST CANCER DIAGNOSIS AND PREDICTION

A COMPARITIVE SURVEY ON DATA MINING TECHNIQUES FOR BREAST CANCER DIAGNOSIS AND PREDICTION A COMPARITIVE SURVEY ON DATA MINING TECHNIQUES FOR BREAST CANCER DIAGNOSIS AND PREDICTION *Hamid Karim Khani Zand Department of Computer Engineering, Iran University of Science and Technology, Tehran,

More information

Lesson 6 Learning II Anders Lyhne Christensen, D6.05, INTRODUCTION TO AUTONOMOUS MOBILE ROBOTS

Lesson 6 Learning II Anders Lyhne Christensen, D6.05, INTRODUCTION TO AUTONOMOUS MOBILE ROBOTS Lesson 6 Learning II Anders Lyhne Christensen, D6.05, anders.christensen@iscte.pt INTRODUCTION TO AUTONOMOUS MOBILE ROBOTS First: Quick Background in Neural Nets Some of earliest work in neural networks

More information

Model-free machine learning methods for personalized breast cancer risk prediction -SWISS PROMPT

Model-free machine learning methods for personalized breast cancer risk prediction -SWISS PROMPT Model-free machine learning methods for personalized breast cancer risk prediction -SWISS PROMPT Chang Ming, 22.11.2017 University of Basel Swiss Public Health Conference 2017 Breast Cancer & personalized

More information

Probability-Utility Model for Managing Evidence-based Central Database

Probability-Utility Model for Managing Evidence-based Central Database The Open Dentistry Journal, 2010, 4, 61-66 61 Open Access Probability-Utility Model for Managing Evidence-based Central Database Janet G. Bauer* UCLA School of Dentistry, Division of Restorative Dentistry,

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

Positive and Unlabeled Relational Classification through Label Frequency Estimation

Positive and Unlabeled Relational Classification through Label Frequency Estimation Positive and Unlabeled Relational Classification through Label Frequency Estimation Jessa Bekker and Jesse Davis Computer Science Department, KU Leuven, Belgium firstname.lastname@cs.kuleuven.be Abstract.

More information

Educator Navigation Guide

Educator Navigation Guide Decoding Breast Cancer Virtual Lab Educator Navigation Guide Decoding Cancer Nav Guide 2 Introduction In this virtual lab, students test tissue samples from different patients with breast cancer in order

More information

Automatic Detection of Epileptic Seizures in EEG Using Machine Learning Methods

Automatic Detection of Epileptic Seizures in EEG Using Machine Learning Methods Automatic Detection of Epileptic Seizures in EEG Using Machine Learning Methods Ying-Fang Lai 1 and Hsiu-Sen Chiang 2* 1 Department of Industrial Education, National Taiwan Normal University 162, Heping

More information

Positive and Unlabeled Relational Classification through Label Frequency Estimation

Positive and Unlabeled Relational Classification through Label Frequency Estimation Positive and Unlabeled Relational Classification through Label Frequency Estimation Jessa Bekker and Jesse Davis Computer Science Department, KU Leuven, Belgium firstname.lastname@cs.kuleuven.be Abstract.

More information

Requirements for Abstracted Text

Requirements for Abstracted Text Slide 1 Requirements for Abstracted Text Principles of Abstracting Lesson 3: Purpose of Text Slide 2 Available Text Fields Place of Diagnosis Immunotherapy Chemotherapy Hormone Therapy Other Therapy Radiation

More information

Performance Based Evaluation of Various Machine Learning Classification Techniques for Chronic Kidney Disease Diagnosis

Performance Based Evaluation of Various Machine Learning Classification Techniques for Chronic Kidney Disease Diagnosis Performance Based Evaluation of Various Machine Learning Classification Techniques for Chronic Kidney Disease Diagnosis Sahil Sharma Department of Computer Science & IT University Of Jammu Jammu, India

More information

Cognitive Neuroscience History of Neural Networks in Artificial Intelligence The concept of neural network in artificial intelligence

Cognitive Neuroscience History of Neural Networks in Artificial Intelligence The concept of neural network in artificial intelligence Cognitive Neuroscience History of Neural Networks in Artificial Intelligence The concept of neural network in artificial intelligence To understand the network paradigm also requires examining the history

More information

2014 Oncology Measures Group Overview

2014 Oncology Measures Group Overview 2014 Oncology Measures Group Overview The Oncology Measures Group is a reporting option that significantly reduces the burden of participation in the Physician Quality Reporting System (PQRS). Source:

More information

An Examination of Factors Affecting Incidence and Survival in Respiratory Cancers. Katie Frank Roberto Perez Mentor: Dr. Kate Cowles.

An Examination of Factors Affecting Incidence and Survival in Respiratory Cancers. Katie Frank Roberto Perez Mentor: Dr. Kate Cowles. An Examination of Factors Affecting Incidence and Survival in Respiratory Cancers Katie Frank Roberto Perez Mentor: Dr. Kate Cowles ISIB 2015 University of Iowa College of Public Health July 16th, 2015

More information

Predicting the Effect of Diabetes on Kidney using Classification in Tanagra

Predicting the Effect of Diabetes on Kidney using Classification in Tanagra Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Financial Disclosure. Learning Objectives. Review and Impact of the NCDB PUF. Moderator: Sandra Wong, MD, MS, FACS, FASCO

Financial Disclosure. Learning Objectives. Review and Impact of the NCDB PUF. Moderator: Sandra Wong, MD, MS, FACS, FASCO Review and Impact of the NCDB PUF Moderator: Sandra Wong, MD, MS, FACS, FASCO Financial Disclosure I do not have personal financial relationships with any commercial interests Learning Objectives At the

More information

Modeling Sentiment with Ridge Regression

Modeling Sentiment with Ridge Regression Modeling Sentiment with Ridge Regression Luke Segars 2/20/2012 The goal of this project was to generate a linear sentiment model for classifying Amazon book reviews according to their star rank. More generally,

More information

QUICK-START GUIDE NCDB Participant Use File (NCDB PUF)

QUICK-START GUIDE NCDB Participant Use File (NCDB PUF) QUICK-START GUIDE 2015 NCDB Participant Use File (NCDB PUF) The data included in this zipped file are provided in a flat text file format, and should be read with software such as SAS, SPSS (PASW), STATA,

More information

An Experimental Study of Diabetes Disease Prediction System Using Classification Techniques

An Experimental Study of Diabetes Disease Prediction System Using Classification Techniques IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 19, Issue 1, Ver. IV (Jan.-Feb. 2017), PP 39-44 www.iosrjournals.org An Experimental Study of Diabetes Disease

More information

Creating prognostic systems for cancer patients: A demonstration using breast cancer

Creating prognostic systems for cancer patients: A demonstration using breast cancer Received: 16 April 2018 Revised: 31 May 2018 DOI: 10.1002/cam4.1629 Accepted: 1 June 2018 ORIGINAL RESEARCH Creating prognostic systems for cancer patients: A demonstration using breast cancer Mathew T.

More information

Incorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.1 7/27/2011

Incorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.1 7/27/2011 Incorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.1 7/27/2011 I. Purpose Drawing from the profile development of the QIBA-fMRI Technical Committee,

More information

International Journal of Software and Web Sciences (IJSWS)

International Journal of Software and Web Sciences (IJSWS) International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) ISSN (Print): 2279-0063 ISSN (Online): 2279-0071 International

More information

The Development and Application of Bayesian Networks Used in Data Mining Under Big Data

The Development and Application of Bayesian Networks Used in Data Mining Under Big Data 2017 International Conference on Arts and Design, Education and Social Sciences (ADESS 2017) ISBN: 978-1-60595-511-7 The Development and Application of Bayesian Networks Used in Data Mining Under Big Data

More information

CS 4365: Artificial Intelligence Recap. Vibhav Gogate

CS 4365: Artificial Intelligence Recap. Vibhav Gogate CS 4365: Artificial Intelligence Recap Vibhav Gogate Exam Topics Search BFS, DFS, UCS, A* (tree and graph) Completeness and Optimality Heuristics: admissibility and consistency CSPs Constraint graphs,

More information

Innovative Risk and Quality Solutions for Value-Based Care. Company Overview

Innovative Risk and Quality Solutions for Value-Based Care. Company Overview Innovative Risk and Quality Solutions for Value-Based Care Company Overview Meet Talix Talix provides risk and quality solutions to help providers, payers and accountable care organizations address the

More information

Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer

Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer Survival Prediction Models for Estimating the Benefit of Post-Operative Radiation Therapy for Gallbladder Cancer and Lung Cancer Jayashree Kalpathy-Cramer PhD 1, William Hersh, MD 1, Jong Song Kim, PhD

More information

BACKPROPOGATION NEURAL NETWORK FOR PREDICTION OF HEART DISEASE

BACKPROPOGATION NEURAL NETWORK FOR PREDICTION OF HEART DISEASE BACKPROPOGATION NEURAL NETWORK FOR PREDICTION OF HEART DISEASE NABEEL AL-MILLI Financial and Business Administration and Computer Science Department Zarqa University College Al-Balqa' Applied University

More information

Methods and Limitations Overview

Methods and Limitations Overview Methods and Limitations Overview Geographic Description of Cuyahoga County Cuyahoga County is comprised of 36 neighborhoods within the City of Cleveland and 58 suburban municipalities. To better understand

More information

Cardiac Arrest Prediction to Prevent Code Blue Situation

Cardiac Arrest Prediction to Prevent Code Blue Situation Cardiac Arrest Prediction to Prevent Code Blue Situation Mrs. Vidya Zope 1, Anuj Chanchlani 2, Hitesh Vaswani 3, Shubham Gaikwad 4, Kamal Teckchandani 5 1Assistant Professor, Department of Computer Engineering,

More information

10CS664: PATTERN RECOGNITION QUESTION BANK

10CS664: PATTERN RECOGNITION QUESTION BANK 10CS664: PATTERN RECOGNITION QUESTION BANK Assignments would be handed out in class as well as posted on the class blog for the course. Please solve the problems in the exercises of the prescribed text

More information

Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER)

Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER) Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER) Yan Wu Advisor: Robert Pruzek Epidemiology and Biostatistics

More information

CANCER PREDICTION SYSTEM USING DATAMINING TECHNIQUES

CANCER PREDICTION SYSTEM USING DATAMINING TECHNIQUES CANCER PREDICTION SYSTEM USING DATAMINING TECHNIQUES K.Arutchelvan 1, Dr.R.Periyasamy 2 1 Programmer (SS), Department of Pharmacy, Annamalai University, Tamilnadu, India 2 Associate Professor, Department

More information

Semi-Automatic Construction of Thyroid Cancer Intervention Corpus from Biomedical Abstracts

Semi-Automatic Construction of Thyroid Cancer Intervention Corpus from Biomedical Abstracts jsci2016 Semi-Automatic Construction of Thyroid Cancer Intervention Corpus from Biomedical Wutthipong Kongburan, Praisan Padungweang, Worarat Krathu, Jonathan H. Chan School of Information Technology King

More information

PERFORMANCE EVALUATION USING SUPERVISED LEARNING ALGORITHMS FOR BREAST CANCER DIAGNOSIS

PERFORMANCE EVALUATION USING SUPERVISED LEARNING ALGORITHMS FOR BREAST CANCER DIAGNOSIS PERFORMANCE EVALUATION USING SUPERVISED LEARNING ALGORITHMS FOR BREAST CANCER DIAGNOSIS *1 Ms. Gayathri M, * 2 Mrs. Shahin A. *1 M.Phil Research Scholar, Department of Computer Science, Auxilium College,

More information