ASSESSMENT OF ECONOMICAL STABILITY OF PROJECT INVESTORS BY MEANS OF HYBRID TECHNIQUES.
Rodríguez, Mª Teresa*; Villanueva, Joaquin**; Menendez, Cesar*; Alonso, Cristina**
*Universidad de Oviedo, Departamento de Matemáticas
**Universidad de Oviedo, Área de Proyectos de Ingeniería
C/ Independencia, Oviedo. mayte@api.uniovi.es

SUMMARY

Carrying out projects usually requires seeking external financing. The characteristics of the projects and the uncertainties involved introduce financial risks that banks must assess before granting finance. This paper presents the use of data mining techniques as a support tool for assessing the risk that a promoter enters a liquidity crisis. To this end, a set of 1,[…] real cases provided by a bank was analyzed, 10% of which entered a liquidity crisis. For each case, information on 26 variables was collected. The file was protected by hiding the identity of the companies and the meaning of the variables. The data file was pre-processed, selecting the most relevant variables by means of an iterative pruning strategy. The final model was generated with multivariate adaptive techniques, providing banks with a valuable tool for assessing the risk of a liquidity crisis when financing projects.

Keywords: data mining, liquidity crisis, project finance

ABSTRACT

In general, the development of a project implies searching for finance. The characteristics of a project, specifically its risk, introduce an important uncertainty that banks must consider when deciding whether or not to provide finance. Promoters getting into a liquidity crisis is one of the major problems banks have to face today.
A liquidity crisis occurs whenever a company is unable to pay its bills on time or lacks sufficient cash to expand inventory and production. Banks need to establish and implement prudent liquidity management policies to assess the liquidity of companies and to protect their own positions. Detecting the risk of a liquidity crisis in a company is a hard task, because companies with the best credit do not need loans and companies with the worst credit are not likely to repay them. A bank's best customers are in the middle. This paper shows how to use data mining techniques to predict the risk that a project gets into a liquidity crisis. A historical dataset of 1,[…] cases was analyzed.
The file was protected to guarantee the privacy of customers by blinding the names of the variables and the names of the companies. The model was created with a combination of several data mining techniques, which make it possible to identify the most relevant variables and to extract rules that facilitate the interpretation of the data. The final model was created using multivariate adaptive techniques, providing banks with an important tool for assessing their own risk in project finance.

Key Words: Data mining, liquidity crisis, project finance

2. INTRODUCTION

A great number of projects need to be bank-financed. Financial institutions must analyze the liquidity risk of the partners involved in the project, ensuring that they will have sufficient liquidity to meet liabilities when due, under both normal and stressed conditions. In general it is hard to determine whether or not a company will get into a liquidity crisis. Data mining techniques based on the historical information of previous projects can help to identify potential risks during the loan approval cycle. These techniques allow the automated analysis of large data sets, finding patterns and trends that might otherwise go undiscovered.

The liquidity crisis problem can be treated as a simple classification: predicting whether a promoter presents a good or a poor credit risk. In this paper we explain how data mining techniques have been used to predict the credit risk from a set of 26 variables describing attributes of the companies, with unknown semantics. The training data set contained 2,[…] samples, with a class distribution of 10% positive cases, i.e. cases where a liquidity crisis occurred, and 90% negative cases. The test data consisted of 1,[…] unlabeled samples. The objective was to capture as many true positive cases as possible among the 2,[…] samples regarded as most likely to enter a liquidity crisis.
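The objective stated above, counting the true positives among the samples ranked as most likely to enter a crisis, can be written down in a few lines. The following is only a minimal sketch: the function name `hits_in_top_k` and the toy scores and labels are assumptions made for illustration and do not come from the original data.

```python
def hits_in_top_k(scores, labels, k):
    """Count true positives among the k samples with the highest risk score."""
    # Rank sample indices from highest to lowest estimated risk.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Labels are 1 (liquidity crisis) or 0 (no crisis), so the sum is a count.
    return sum(labels[i] for i in ranked[:k])


# Toy data (made up for illustration only).
scores = [0.91, 0.15, 0.62, 0.40, 0.78, 0.05]
labels = [1, 0, 0, 1, 1, 0]
print(hits_in_top_k(scores, labels, k=3))
```

A model that ranks well under this criterion need not be well calibrated; only the ordering of the scores matters.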
The following sections describe the work carried out to create the data mining model for assessing the economic stability of investors, and the results achieved. We start by describing the characteristics of the data set and the main tasks performed to pre-process the data. Next, we describe the techniques used and the reasons for choosing them. We then explain the methodology followed to select the most relevant variables for modelling. Finally, we present the results and conclusions.

1. DATA UNDERSTANDING

The data set used in this work was provided by the Deutsche Sparkassen- und Giroverband (DSGV) bank. It consists of a blinded historical set of 2,[…] samples, each formed by 26 variables describing attributes of the companies and a binary variable recording whether or not the company got into a liquidity crisis.
The first step in creating the model focused on becoming familiar with the data, identifying data quality problems, gaining first insights into the data and detecting interesting subsets from which to form hypotheses about hidden information. The variables were studied in order to determine their influence on the output, detect relations between attributes, examine the quality of the data and analyze possible transformations and other data preparation needed for further analysis.

The preliminary analysis detected an average of 24% missing values per feature, many of them recorded in the file with a special placeholder value. The results of data mining models depend strongly on the quality of the data, so a strategy for handling the missing values is necessary. There are several alternatives: filtering out cases, replacing missing values (by the median, the mean or k-nearest neighbours), selecting a modelling technique capable of dealing with missing values, or estimating the values from information about the process. In general, filtering cases is a good option, but here it was not possible because of the high number of missing values. The strategy adopted by the research team consisted of replacing the missing values with the average and adding, for each variable with missing values, a new binary variable that marks the positions with missing data with a 1 and the remaining positions with a 0. In this way the data mining technique can handle missing data by considering two possible cases: the variable is known, or the variable is missing.

In order to extract information about the process, the variables were discretized using equi-depth histograms, and the probability of getting into a liquidity crisis was calculated for each interval of the histogram.
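The two pre-processing steps described above, mean imputation with a missing-value indicator and equi-depth discretization with a crisis rate per interval, can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names, the use of `None` for a missing entry and the toy values are all assumptions.

```python
def impute_with_indicator(values):
    """Replace missing entries (None) with the mean of the observed entries.

    Returns (filled, is_missing), where is_missing[i] is 1 at imputed
    positions and 0 elsewhere.
    """
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in values]
    is_missing = [1 if v is None else 0 for v in values]
    return filled, is_missing


def equidepth_crisis_rate(values, labels, n_bins):
    """Sort samples by value, cut them into n_bins bins of (nearly) equal
    size and return one (low, high, crisis_rate) tuple per non-empty bin.
    Tied values may end up on both sides of a bin boundary."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    result = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue
        rate = sum(labels[i] for i in idx) / len(idx)
        result.append((values[idx[0]], values[idx[-1]], rate))
    return result


# Toy variable with two missing entries (made up for illustration only).
raw = [-3.0, None, 1.5, -0.5, 2.0, None, -1.0, 4.0]
crisis = [1, 0, 0, 1, 0, 1, 1, 0]

filled, is_missing = impute_with_indicator(raw)

# Discretize the observed values only, so imputed means do not form a spike.
obs_values = [v for v in raw if v is not None]
obs_labels = [t for v, t in zip(raw, crisis) if v is not None]
for low, high, rate in equidepth_crisis_rate(obs_values, obs_labels, n_bins=2):
    print(low, high, rate)
```

Whether the histograms in the paper were built before or after imputation is not stated; the sketch discretizes the observed values only, which is one reasonable choice.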
Figure 1 shows two examples of the plots obtained with the equi-depth discretization for two input variables: VAR1 (left) and VAR22 (right). Negative values of VAR1 have a greater probability of getting into a liquidity crisis than positive values, while the magnitude of the variable is not relevant. For VAR22 the magnitude is relevant: the risk decreases as the magnitude of VAR22 increases.

Figure 1: Discretization of the input variables and analysis of the probability to get into a liquidity crisis. Panels: VAR1 (left) and VAR22 (right); vertical axis: probability.

2. DESCRIPTION OF THE TECHNIQUE SELECTED

The problem under study is a binary classification problem: predicting whether or not a promoter will get into a liquidity crisis. We are also interested in predicting the risk that a promoter gets into a liquidity crisis. In addition, we have to take into account that the quality of the data is poor, with a high proportion of missing values. For these reasons we selected ApiMARS as the data mining technique. ApiMARS is a modification, made by the research team, of the standard MARS algorithm [Friedman, 1991]; it provides a measurement of the liquidity crisis risk and is robust to noisy data.

The MARS procedure builds flexible regression models by fitting separate splines (or basis functions) to distinct intervals of the predictor variables. Both the variables to use and the end points of the intervals for each variable, referred to as knots, are found via an exhaustive search procedure in a two-phase process. In the first phase, a model is grown by adding basis functions (new main effects, knots, or interactions) up to a maximum number predetermined by the user. In the second phase, basis functions are deleted in order of least contribution to the model until an optimal balance of bias and variance is found. MARS has been used successfully for multidimensional fitting in applications from several fields, sometimes outperforming neural networks [Rodriguez, 2003].

The ApiMARS procedure is a modification of the basic MARS algorithm that automatically determines the number of basis functions from the data, changing the forward/backward procedures so that different data are presented in every step. It inherits from MARS the capability to selectively blank out some regions of a variable in order to focus on the most promising zones, which makes it a good tool for finding interactions between variables and complex data structures. In addition, the algorithm can treat missing data, using basis functions that blank out a variable for the cases in which it contains missing values.

3. SELECTION OF RELEVANT VARIABLES

Although the number of variables in this problem is relatively low (26 variables), it is convenient to reduce the dimensionality of the input space and, if possible, to decide whether one or more attributes are more important than others, weighting the attributes accordingly.
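As a reminder of the building blocks behind the MARS models described in the previous section, the sketch below evaluates a model made of hinge basis functions and their products. It is a generic illustration of standard MARS terms, not the ApiMARS implementation: the coefficients and knots are made up, and returning 0 for a missing value is only one assumed way of "blanking out" a variable.

```python
def hinge(x, knot, sign=+1):
    """MARS hinge function max(0, sign * (x - knot)).

    A missing value (None) yields 0, so any term using it is blanked out.
    This treatment of missing data is an illustrative assumption.
    """
    if x is None:
        return 0.0
    return max(0.0, sign * (x - knot))


def mars_predict(row, intercept, terms):
    """Evaluate intercept + sum over terms of coef * product of hinges.

    terms is a list of (coef, factors); each factor is a tuple
    (variable_index, knot, sign).
    """
    total = intercept
    for coef, factors in terms:
        value = coef
        for var, knot, sign in factors:
            value *= hinge(row[var], knot, sign)
        total += value
    return total


# Hypothetical model: every number below is made up for illustration.
terms = [
    (0.5, [(0, 0.0, -1)]),                  # main effect, active when x0 < 0
    (-0.25, [(1, 2.0, +1)]),                # main effect, active when x1 > 2
    (0.125, [(0, 0.0, -1), (1, 2.0, +1)]),  # interaction of the two hinges
]
print(mars_predict([-2.0, 4.0], 0.25, terms))
print(mars_predict([None, 4.0], 0.25, terms))
```

Because each hinge is zero on one side of its knot, a term contributes only in the region where all of its factors are active, which is what lets the model focus on selected zones of a variable.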
Variable selection and the weighting of the relative importance of the variables followed an iterative pruning strategy built on the ApiMARS algorithm. The strategy starts by considering all the variables selected as candidates during the pre-processing phase, and trains different ApiMARS models on a great number of randomly selected sets of patterns (80% training, 20% test). The models are then analyzed, weighting the variables by two parameters: the importance of the variable within the model and the importance of the model. The importance of the variable within the model is calculated with a sensitivity analysis of the loss of fit when the variable is removed, and the importance of the model is calculated from the mean squared error on the test patterns. Irrelevant or barely relevant variables are removed, and the process is repeated iteratively until all such variables have been removed.

Applied to the problem under study, the strategy stopped after 3 iterations, detecting 6 of the 26 variables as not relevant. In Figure 2 the variables are ordered by their relative importance. Five variables (Var_23, Var_24, Var_3, Var_11 and Var_15) are the most important. The reliability of the application therefore depends on the quality of these variables, so it is important to make an effort to collect them with the greatest possible quality.

Figure 2: Relative importance of variables (in %), listed from most to least important, starting with VAR23, VAR24, VAR3, VAR11 and VAR15.

4. TRAINING AND TESTING

The parameters of the data mining technique must be calibrated in order to avoid under-training or over-fitting problems. Figure 3 shows the usual behaviour of the performance of the model versus its complexity for the training and the test samples. The error evaluated on the training sample decreases with the complexity of the model, but the error on the test sample can increase because the models lose generality. To optimise the results, an equilibrium between the training and test errors must be found.
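The iterative pruning strategy used for variable selection can be sketched in plain Python. ApiMARS is not publicly available, so the sketch swaps in an ordinary least-squares model as a stand-in; the way the two weights are combined (loss of fit multiplied by the inverse of the test mean squared error), the 2% threshold, the number of rounds and the synthetic data are all assumptions chosen for illustration.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(rows, y, cols, ridge=1e-8):
    """Least-squares fit with intercept on the chosen columns (stand-in model)."""
    design = [[1.0] + [r[c] for c in cols] for r in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def mse(rows, y, cols, w):
    """Mean squared error of the linear model w on the given samples."""
    preds = [w[0] + sum(wi * r[c] for wi, c in zip(w[1:], cols)) for r in rows]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def variable_importance(rows, y, cols, n_rounds, rng):
    """Relative importance of each variable over random 80/20 splits."""
    score = {c: 0.0 for c in cols}
    idx = list(range(len(rows)))
    for _ in range(n_rounds):
        rng.shuffle(idx)
        cut = int(0.8 * len(idx))
        xtr, ytr = [rows[i] for i in idx[:cut]], [y[i] for i in idx[:cut]]
        xte, yte = [rows[i] for i in idx[cut:]], [y[i] for i in idx[cut:]]
        w = fit(xtr, ytr, cols)
        base = mse(xtr, ytr, cols, w)
        # Importance of the model: better test error gives a larger weight.
        model_weight = 1.0 / (mse(xte, yte, cols, w) + 1e-12)
        for c in cols:
            # Importance of the variable: loss of fit when it is removed.
            rest = [v for v in cols if v != c]
            loss = mse(xtr, ytr, rest, fit(xtr, ytr, rest)) - base
            score[c] += model_weight * max(loss, 0.0)
    total = sum(score.values()) or 1.0
    return {c: s / total for c, s in score.items()}


def iterative_pruning(rows, y, threshold=0.02, n_rounds=20, seed=0):
    """Drop variables below the importance threshold until none is dropped."""
    rng = random.Random(seed)
    cols = list(range(len(rows[0])))
    while True:
        imp = variable_importance(rows, y, cols, n_rounds, rng)
        kept = [c for c in cols if imp[c] >= threshold]
        if len(kept) == len(cols) or not kept:
            return cols, imp
        cols = kept


# Synthetic data: only the first two of four variables drive the output.
data_rng = random.Random(1)
rows = [[data_rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
y = [3.0 * r[0] - 2.0 * r[1] + data_rng.gauss(0, 0.1) for r in rows]
selected, importance = iterative_pruning(rows, y)
print(selected)
```

On this synthetic example the two noise variables receive a negligible share of the importance and are pruned in the first iteration, after which the loop stops because no further variable falls below the threshold.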
Figure 3: Performance of the model versus complexity.

In order to evaluate the fitness of the models, the data were divided by means of 5-fold cross validation. That is, the data set was divided into 5 subsets, using 4 subsets to train the model and the remaining subset to test it, and repeating the process 5 times. Figure 4 shows the success rate of the model evaluated on the test set as a function of the estimated risk of a liquidity crisis. It shows that the best cut-off for separating the promoters that get into a liquidity crisis lies in the interval (0.45, 0.55), where more than 92% of the promoters are classified correctly.

Figure 4: Results of the model for the test sets (vertical axis: success rate; horizontal axis: probability estimated by the Uniovi model).
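The 5-fold cross validation described above can be sketched as follows. The splitting logic is generic; the one-variable threshold classifier is only a stand-in for the real model, and all names and toy data are assumptions made for illustration.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds of similar size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(x, y, fit, predict, k=5, seed=0):
    """Train on k-1 folds, test on the remaining one, and repeat k times.

    Returns the success rate (fraction classified correctly) per fold.
    """
    rates = []
    for held_out in k_fold_indices(len(x), k, seed):
        test = set(held_out)
        train = [i for i in range(len(x)) if i not in test]
        model = fit([x[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, x[i]) == y[i] for i in held_out)
        rates.append(correct / len(held_out))
    return rates


# Stand-in model: a cut-off halfway between the two class means.
def fit_threshold(xs, ys):
    pos = [v for v, t in zip(xs, ys) if t == 1]
    neg = [v for v, t in zip(xs, ys) if t == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, v):
    return 1 if v > threshold else 0


# Toy risk scores and labels (made up for illustration only).
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
rates = cross_validate(x, y, fit_threshold, predict_threshold)
print(sum(rates) / len(rates))
```

With a class distribution as unbalanced as the one in this study, a stratified split that keeps the share of positive cases equal in every fold would be a natural refinement; the source does not say whether one was used.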
5. CONCLUSIONS

This paper describes a new method, based on data mining techniques, for assessing the economic stability of project investors. A data set of 1,[…] samples with 26 attributes was used to create the model, with 10% of positive cases. The data set is characterised by a great number of missing values (24% on average per feature). To deal with this problem, a new binary variable was associated with each variable containing missing values. The new variable marks the missing cases with a 1 and the valid cases with a 0, and the missing values in the original variable were replaced by the average of the variable.

The ApiMARS algorithm was selected as the data mining technique for both the selection of relevant variables and the modelling. In this type of problem, collecting reliable data is one of the greatest difficulties. The ApiMARS algorithm makes it possible to weight the most relevant variables, so that efforts can be focused on capturing this information correctly.

The parameters of the model were calibrated by means of 5-fold cross validation in order to avoid under-fitting or over-fitting problems. The results obtained are very promising, successfully identifying 92% of the promoters getting into a liquidity crisis.

6. REFERENCES

[1] Friedman, J. H. Multivariate adaptive regression splines. The Annals of Statistics, Vol. 19, Nº 1, 1-141, 1991.
[2] Rodriguez Montequín, M. T. Modelado evolutivo mediante técnicas adaptativas aplicado al control de inclusiones en bobinas laminadas en caliente. Tesis doctoral, 2003.
[3] Rodríguez, M. Teresa; Ortega, Francisco; Rendueles, Jose Luis; Menéndez, Cesar. Combination of Multivariate Adaptive Techniques and Neural Networks for Prediction and Control of Internal Cleanliness in Steel Strips. Proceedings of EUNITE
Development of Soft-Computing techniques capable of diagnosing Alzheimer s Disease in its pre-clinical stage combining MRI and FDG-PET images. Olga Valenzuela, Francisco Ortuño, Belen San-Roman, Victor
More informationMULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES
24 MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES In the previous chapter, simple linear regression was used when you have one independent variable and one dependent variable. This chapter
More informationTRIPLL Webinar: Propensity score methods in chronic pain research
TRIPLL Webinar: Propensity score methods in chronic pain research Felix Thoemmes, PhD Support provided by IES grant Matching Strategies for Observational Studies with Multilevel Data in Educational Research
More informationA NOVEL VARIABLE SELECTION METHOD BASED ON FREQUENT PATTERN TREE FOR REAL-TIME TRAFFIC ACCIDENT RISK PREDICTION
OPT-i An International Conference on Engineering and Applied Sciences Optimization M. Papadrakakis, M.G. Karlaftis, N.D. Lagaros (eds.) Kos Island, Greece, 4-6 June 2014 A NOVEL VARIABLE SELECTION METHOD
More informationAutomated Medical Diagnosis using K-Nearest Neighbor Classification
(IMPACT FACTOR 5.96) Automated Medical Diagnosis using K-Nearest Neighbor Classification Zaheerabbas Punjani 1, B.E Student, TCET Mumbai, Maharashtra, India Ankush Deora 2, B.E Student, TCET Mumbai, Maharashtra,
More informationData Management, Data Management PLUS User Guide
Data Management, Data Management PLUS User Guide Table of Contents Introduction 3 SHOEBOX Data Management and Data Management PLUS (DM+) for Individual Users 4 Portal Login 4 Working With Your Data 5 Manually
More informationReactive agents and perceptual ambiguity
Major theme: Robotic and computational models of interaction and cognition Reactive agents and perceptual ambiguity Michel van Dartel and Eric Postma IKAT, Universiteit Maastricht Abstract Situated and
More informationAsthma Surveillance Using Social Media Data
Asthma Surveillance Using Social Media Data Wenli Zhang 1, Sudha Ram 1, Mark Burkart 2, Max Williams 2, and Yolande Pengetnze 2 University of Arizona 1, PCCI-Parkland Center for Clinical Innovation 2 {wenlizhang,
More informationStatistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN
Statistical analysis DIANA SAPLACAN 2017 * SLIDES ADAPTED BASED ON LECTURE NOTES BY ALMA LEORA CULEN Vs. 2 Background 3 There are different types of research methods to study behaviour: Descriptive: observations,
More informationCardiac Arrest Prediction to Prevent Code Blue Situation
Cardiac Arrest Prediction to Prevent Code Blue Situation Mrs. Vidya Zope 1, Anuj Chanchlani 2, Hitesh Vaswani 3, Shubham Gaikwad 4, Kamal Teckchandani 5 1Assistant Professor, Department of Computer Engineering,
More informationA Comparison of Collaborative Filtering Methods for Medication Reconciliation
A Comparison of Collaborative Filtering Methods for Medication Reconciliation Huanian Zheng, Rema Padman, Daniel B. Neill The H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, 15213,
More informationGene Selection for Tumor Classification Using Microarray Gene Expression Data
Gene Selection for Tumor Classification Using Microarray Gene Expression Data K. Yendrapalli, R. Basnet, S. Mukkamala, A. H. Sung Department of Computer Science New Mexico Institute of Mining and Technology
More informationKeywords Missing values, Medoids, Partitioning Around Medoids, Auto Associative Neural Network classifier, Pima Indian Diabetes dataset.
Volume 7, Issue 3, March 2017 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Medoid Based Approach
More informationStrategic Uncertainty and Risk Attitudes: The Experimental Connection*
Cuadernos de Economía. Vol. 27, 139-152, 2004 Strategic Uncertainty and Risk Attitudes: The Experimental Connection* Pablo Brañas-Garza 1 Francisca Jiménez-Jiménez Departamento de Economía, U. de Jaén,
More informationAlgorithms Implemented for Cancer Gene Searching and Classifications
Algorithms Implemented for Cancer Gene Searching and Classifications Murad M. Al-Rajab and Joan Lu School of Computing and Engineering, University of Huddersfield Huddersfield, UK {U1174101,j.lu}@hud.ac.uk
More informationApplied Statistical Analysis EDUC 6050 Week 4
Applied Statistical Analysis EDUC 6050 Week 4 Finding clarity using data Today 1. Hypothesis Testing with Z Scores (continued) 2. Chapters 6 and 7 in Book 2 Review! = $ & '! = $ & ' * ) 1. Which formula
More informationInternational Journal of Research in Science and Technology. (IJRST) 2018, Vol. No. 8, Issue No. IV, Oct-Dec e-issn: , p-issn: X
CLOUD FILE SHARING AND DATA SECURITY THREATS EXPLORING THE EMPLOYABILITY OF GRAPH-BASED UNSUPERVISED LEARNING IN DETECTING AND SAFEGUARDING CLOUD FILES Harshit Yadav Student, Bal Bharati Public School,
More informationJ2.6 Imputation of missing data with nonlinear relationships
Sixth Conference on Artificial Intelligence Applications to Environmental Science 88th AMS Annual Meeting, New Orleans, LA 20-24 January 2008 J2.6 Imputation of missing with nonlinear relationships Michael
More informationFrom single studies to an EBM based assessment some central issues
From single studies to an EBM based assessment some central issues Doug Altman Centre for Statistics in Medicine, Oxford, UK Prognosis Prognosis commonly relates to the probability or risk of an individual
More informationInstructions for the Use of MInCir Scale to Assess Methodological Quality in Therapy Studies
Int. J. Morphol., (4):46-467, 205. Instructions for the Use of MInCir Scale to Assess Methodological Quality in Therapy Studies Instrucciones para la Utilización de la Escala MInCir para Valorar Calidad
More informationMultiple Linear Regression (Dummy Variable Treatment) CIVL 7012/8012
Multiple Linear Regression (Dummy Variable Treatment) CIVL 7012/8012 2 In Today s Class Recap Single dummy variable Multiple dummy variables Ordinal dummy variables Dummy-dummy interaction Dummy-continuous/discrete
More informationUsing Bayesian Networks to Direct Stochastic Search in Inductive Logic Programming
Appears in Proceedings of the 17th International Conference on Inductive Logic Programming (ILP). Corvallis, Oregon, USA. June, 2007. Using Bayesian Networks to Direct Stochastic Search in Inductive Logic
More informationYouth Using Behavioral Health Services. Making the Transition from the Child to Adult System
Youth Using Behavioral Health Services Making the Transition from the Child to Adult System Allegheny HealthChoices Inc. January 2013 Youth Using Behavioral Health Services: Making the Transition from
More informationIndex. E Eftekbar, B., 152, 164 Eigenvectors, 6, 171 Elastic net regression, 6 discretization, 28 regularization, 42, 44, 46 Exponential modeling, 135
A Abrahamowicz, M., 100 Akaike information criterion (AIC), 141 Analysis of covariance (ANCOVA), 2 4. See also Canonical regression Analysis of variance (ANOVA) model, 2 4, 255 canonical regression (see
More informationStepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality
Week 9 Hour 3 Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Stat 302 Notes. Week 9, Hour 3, Page 1 / 39 Stepwise Now that we've introduced interactions,
More informationT. R. Golub, D. K. Slonim & Others 1999
T. R. Golub, D. K. Slonim & Others 1999 Big Picture in 1999 The Need for Cancer Classification Cancer classification very important for advances in cancer treatment. Cancers of Identical grade can have
More informationMARS Ambulatory ECG Analysis The power to assess and predict
GE Healthcare MARS Ambulatory ECG Analysis The power to assess and predict Connecting hearts and minds Prevention starts with knowledge Around the world, heart disease is one of our fastest-growing health
More informationPositive and Unlabeled Relational Classification through Label Frequency Estimation
Positive and Unlabeled Relational Classification through Label Frequency Estimation Jessa Bekker and Jesse Davis Computer Science Department, KU Leuven, Belgium firstname.lastname@cs.kuleuven.be Abstract.
More informationUsing Health Economics to Inform the Development of Medical Devices. Matthew Allsop MATCH / BITECIC
Using Health Economics to Inform the Development of Medical Devices Matthew Allsop MATCH / BITECIC Overview Background to MATCH Overview of health economics in product development Concepts relating to
More informationHOW RARE IS YOUR BLOOD? DONATE ( ) CANADIAN BLOOD SERVICES RARE BLOOD PROGRAM
HOW RARE IS YOUR BLOOD? CANADIAN BLOOD SERVICES RARE BLOOD PROGRAM If you have a rare blood type, become a donor and help save lives across Canada and around the world. WHAT IS RARE BLOOD? Did you know
More informationControlled Experiments
CHARM Choosing Human-Computer Interaction (HCI) Appropriate Research Methods Controlled Experiments Liz Atwater Department of Psychology Human Factors/Applied Cognition George Mason University lizatwater@hotmail.com
More informationAristomenis Kotsakis,Matthias Nübling, Nikolaos P. Bakas, George Pelekanakis, John Thanopoulos
2nd International Conference on Sustainable Employability Building Bridges between Science and Practice - http://www.employability21.com/ 12-13 September 2018 Provinciehuis Vlaams Brabant, Leuven, Belgium
More informationPerformance Dashboard for the Substance Abuse Treatment System in Los Angeles
Performance Dashboard for the Substance Abuse Treatment System in Los Angeles Abdullah Alibrahim, Shinyi Wu, PhD and Erick Guerrero PhD Epstein Industrial and Systems Engineering University of Southern
More informationCocaine Use among High School Students in Six South American Countries. Marya Hynes Dowell 1 Héctor Suárez 2 Francisco Cumsille 3
Cocaine Use among High School Students in Six South American Countries Marya Hynes Dowell 1 Héctor Suárez 2 Francisco Cumsille 3 Abstract Objectives: To compare lifetime and past year prevalence estimates
More informationKeeping Abreast of Breast Imagers: Radiology Pathology Correlation for the Rest of Us
SIIM 2016 Scientific Session Quality and Safety Part 1 Thursday, June 30 8:00 am 9:30 am Keeping Abreast of Breast Imagers: Radiology Pathology Correlation for the Rest of Us Linda C. Kelahan, MD, Medstar
More informationReliability of feedback fechanism based on root cause defect analysis - case study
Annales UMCS Informatica AI XI, 4 (2011) 21 32 DOI: 10.2478/v10065-011-0037-0 Reliability of feedback fechanism based on root cause defect analysis - case study Marek G. Stochel 1 1 Motorola Solutions
More informationAn Improved Algorithm To Predict Recurrence Of Breast Cancer
An Improved Algorithm To Predict Recurrence Of Breast Cancer Umang Agrawal 1, Ass. Prof. Ishan K Rajani 2 1 M.E Computer Engineer, Silver Oak College of Engineering & Technology, Gujarat, India. 2 Assistant
More information