How to Create Better Performing Bayesian Networks: A Heuristic Approach for Variable Selection

Esma Nur Cinicioglu* and Gülseren Büyükuğur

Istanbul University, School of Business, Quantitative Methods Division, Avcilar, 34322, Istanbul, Turkey
esmanurc@istanbul.edu.tr, gulsayici@gmail.com

Abstract. Variable selection in Bayesian networks is necessary to assure the quality of the learned network structure. Cinicioglu & Shenoy (2012) suggested an approach for variable selection in Bayesian networks in which a score, S_j, is developed to assess whether each variable should be included in the final Bayesian network. However, with this method the variables without parents or children are punished, which affects the performance of the learned network. To eliminate that drawback, in this paper we develop a new score, NS_j. We measure the performance of this new heuristic in terms of the prediction capacity of the learned network and its lift over the marginal model, and evaluate its success by comparing it with the results obtained by the previously developed S_j score. A credit score data set is used for the illustration of the developed heuristic and the comparison of the results.

Keywords: Bayesian networks, Variable selection, Heuristic.

1 Introduction

The upsurge in the popularity of Bayesian networks has brought a parallel increase in research on algorithms for learning Bayesian network structures from data sets. The ability of Bayesian networks to represent the probabilistic relationships between variables is one of the main reasons for their rising reputation as an inference tool, and it also generates their major appeal for data mining. With the advancement and diversification of structure learning algorithms, more variables may be incorporated into the learning process, bigger data sets may be used for learning, and inference becomes faster even in the presence of continuous variables. The progress achieved on structure learning algorithms for Bayesian networks is encouraging for the increasing use of Bayesian networks as general decision support systems, as data mining tools, and for probabilistic inference. On the other hand, though the quality of a learned network may be evaluated from many different aspects, the performance of the learned network depends very much on the selection of the variables to be included in the network. Depending on the purpose of the application, the characteristics of an application may differ, and hence the expectations of a Bayesian network's performance may vary.

* Corresponding author.

A. Laurent et al. (Eds.): IPMU 2014, Part I, CCIS 442, pp. 527-535, 2014. © Springer International Publishing Switzerland 2014

Therefore, to make sure that the learning process ends up with a Bayesian network of high quality, variable selection in Bayesian networks should constitute an important dimension of that process.

There is a considerable literature in statistics on measures like AIC, BIC, Mallows's C_p statistic, etc. that are used for variable selection in statistical models. These measures have been adopted by the machine learning community for evaluating the score-based methods for learning Bayesian network models (Scutari, 2010). However, these scores are used as measures of the relative quality of the learned network and do not assist in the variable selection process. Additionally, as discussed in Cui et al. (2010), traditional methods of stepwise variable selection do not consider the interrelations among variables and may not identify the best subset for model building. Despite the interest in structure learning algorithms and the adaptation of different measures for the evaluation of the resulting Bayesian networks, variable selection in Bayesian networks is a topic which needs further attention from researchers.

Previously, Koller and Sahami (1996) elaborated on the importance of feature selection and stated that the goal should be to eliminate a feature if it gives us little or no additional information. Hruschka et al. (2004) described a Bayesian feature selection approach for classification problems: in their work, first a BN is created from a dataset and then the Markov blanket of the class variable is used for the feature subset selection task. Sun & Shenoy (2007) provided a heuristic method to guide the selection of variables in naïve Bayes models; to achieve this goal, the proposed heuristic relies on correlations and partial correlations among variables. Another heuristic for variable selection in Bayesian networks was proposed by Cinicioglu & Shenoy (2012). With this heuristic a score called S_j was developed which helps to determine the variables to be used in the final Bayesian network. Under this heuristic, first an initial Bayesian network is learned with the purpose of obtaining the conditional probability tables (cpts) of all the variables in the network. The cpts indicate the association of a variable with the other variables in the network. Using the cpt of each variable, its corresponding S_j score is calculated. In their paper Cinicioglu & Shenoy (2012) illustrate that by applying the proposed heuristic the performance of the learned network in terms of prediction capacity may be improved substantially.

In this paper we first discuss the S_j score and then identify a problem: though the S_j score demonstrates a sound performance on prediction capacity, its formula punishes the variables without parents or children in the network, which in turn affects the overall performance of the heuristic. To eliminate that drawback, we suggest a modified version of the S_j score, called NS_j. We measure the performance of this new score in terms of the prediction capacity of the learned network and its lift compared to the marginal model, and evaluate its success by comparing it with the results obtained by the previously developed S_j score. A credit score data set is used for the illustration of the developed heuristic and the comparison of the results.

The outline of the remainder of the paper is as follows. The next section gives details about the credit data set used for the application of the proposed heuristic.
In Section 3 the development of the new heuristic is explained, with the S_j and NS_j scores discussed in detail in Subsections 3.1 and 3.2 respectively. In Section 4, different Bayesian networks are created using both of the variable selection scores S_j and NS_j, and the performance results of the two heuristics are compared in terms of prediction capacity and the improvement rates obtained over the marginal model.

2 Data Set

The data set used in this study is a free data set, called the German credit data, provided by the UCI Machine Learning Repository (Murphy & Aha, 1994). The original form of the data set contains the information of 1000 customers on 20 different attributes, 13 categorical and 7 numerical, giving the information necessary to evaluate a customer's eligibility for a credit. Before the data set is used for the application of the proposed heuristics, several changes are made to it. In this research, the German credit data set is transformed into a form where the numerical attributes Duration in month, Credit amount, Installment rate in percentage of disposable income, Present residence since, Age in years, Number of existing credits at this bank, and Number of people being liable to provide maintenance for are discretized. The variable Personal status and sex is divided into two categorical variables, Personal status and Sex. In the original data set the categorical variable Purpose contains eleven different states. In this paper some of these states are joined together, like car and used car as car; furniture, radio and domestic appliances as appliances; and retraining and business as business, resulting in seven different states at the end. The final data set used in this study consists of 21 columns and 1000 rows, corresponding to the number of variables and the number of cases respectively.
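As an illustration of the kind of preprocessing described above, a minimal pandas sketch follows. The file name german_credit.csv is an assumption, as is the premise that the file carries descriptive column names and category labels (the raw UCI file uses coded values); the bin edges for Credit amount mirror the intervals visible in Table 1 below. This is only one plausible way to carry out the transformation, not the authors' exact procedure.

```python
import pandas as pd

# Assumed: a CSV export of the German credit data with descriptive
# column names and category labels (the raw UCI file uses A-codes).
df = pd.read_csv("german_credit.csv")

# Discretize a numerical attribute; the bin edges mirror the Credit
# amount intervals visible in Table 1.
df["Credit amount"] = pd.cut(df["Credit amount"],
                             bins=[0, 4000, 8000, 12000, 16000, 20000])

# Merge some states of the Purpose variable, e.g. car and used car as
# car, and retraining and business as business.
df["Purpose"] = df["Purpose"].replace({"used car": "car",
                                       "radio/TV": "appliances",
                                       "furniture/equipment": "appliances",
                                       "domestic appliances": "appliances",
                                       "retraining": "business"})
```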

3 Development of the Proposed Heuristic

3.1 S_j Score

The heuristic developed by Cinicioglu & Shenoy (2012) is based on the principle that a good prediction capacity of a Bayesian network depends on the choice of variables that have high associations with each other. A marginal variable present in a network will not have any dependencies with the remaining variables in the network and thus won't have any impact on the overall performance of the network. In that sense, an arc learned by an existing structure learning algorithm shows the dependency of a child node on its parent node and is hence a proof of association. However, not all variables which do not place themselves as marginals can be incorporated into the final Bayesian network. The idea is to develop an efficient heuristic for variable selection such that the Bayesian network created using the selected variables shows a superior prediction performance compared to the random inclusion of variables in the network. Besides, though the presence of an arc shows the dependency relationship between two variables in the network, the degree of association is not measured by it and may vary quite considerably among variables. A natural way to examine the association of a variable with the other variables considered for inclusion in the final Bayesian network is to learn an initial Bayesian network structure first and then use the conditional probability table of each variable as a source of measurement for the degree of association.

By applying a distance measure to the conditional probability table of a variable, the degree of change in the conditional probabilities of a child node depending on the states of its parents may be measured. Here a high average distance indicates that the conditional probability distribution of the variable considered changes a great deal depending on the states of its parents. Thus, a high average distance is an indication of the high association of a child node with its parents. The average distance of each variable may be calculated using the formula given below, where d represents the average distance of the variable of interest from its parents, p and q stand for the conditional probability distributions of this variable under two different configurations of its parents' states, i indexes the different states of the child node, m stands for the number of states of the child node, and n stands for the number of states of the set of parent nodes:

d = \frac{1}{\binom{n}{2}} \sum_{(p,q)} \frac{1}{m} \sum_{i=1}^{m} (p_i - q_i)^2    (1)

Here the outer sum runs over all \binom{n}{2} pairs of parent-state configurations.

However, there may be variables in the network which do not have a high level of association with their parents but do possess a high association with their children. Basing the selection process solely on the average distance of each variable will deteriorate the performance of the network created. Besides, while the average distance obtained from the cpt of a variable shows the degree of association of a child node with its parents, the same average distance also shows the degree of association of a parent node with its child, jointly with the child's other parents. Following this logic, Cinicioglu & Shenoy (2012) developed the S_j score given in Equation (2) below. In this formula the S_j score of a variable j is the sum of the average distance d_j of this variable and the average of the average distances of its children, where d_{ij} denotes the average distance of the child variable i of variable j and c_j denotes the number of j's children:

S_j = d_j + \frac{1}{c_j} \sum_{i=1}^{c_j} d_{ij}    (2)

Consider Table 1 given below, which is the cpt of the variable Credit amount. Using the formula given in Equation (1), the average distance of this variable is calculated as 0.0107. Considering Figure 1 given below, we see that Credit Amount possesses three children. Hence, in order to calculate the S_j score of Credit Amount, we need to find the average distances of the children, average them, and then add the result to the average distance of Credit Amount. A high S_j score is desired as an indication of high association with other variables. Ideally, according to the heuristic, the variable with the lowest S_j score is excluded from the analysis and a new BN is created with the remaining variables. This network provides new cpts, which are the basis for selecting the next variable to be excluded from the network. This process is repeated until the desired number of variables is obtained. This repeated process is the ideal way of applying the heuristic; however, if not automated, it requires a great deal of time. In the following Subsection 3.2, the shortcomings of the S_j score are discussed and, as a modification of the S_j score to handle the problems involved with the old variable selection method, a new score called NS_j is suggested.
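For concreteness, the short Python sketch below implements Equations (1) and (2) as reconstructed above. The cpt layout (one probability row per configuration of the parents' states) and all function names are illustrative assumptions rather than the authors' implementation; as a sanity check, the sketch reproduces the average distance of 0.0107 reported for the cpt of Credit Amount in Table 1.

```python
from itertools import combinations

def avg_distance(cpt_rows):
    """Average distance d of a variable, Equation (1).

    cpt_rows holds one probability row per configuration of the parents'
    states. A variable without parents has a single (marginal) row and
    no row pairs, so its distance is taken to be zero (cf. Equation (3)).
    """
    pairs = list(combinations(cpt_rows, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for p, q in pairs:
        # mean squared difference across the m states of the child node
        total += sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p)
    return total / len(pairs)

def s_score(d_j, child_distances):
    """S_j, Equation (2): d_j plus the mean of the children's distances."""
    if not child_distances:          # variable without children, cf. Equation (4)
        return d_j
    return d_j + sum(child_distances) / len(child_distances)

# Sanity check against Table 1 (cpt of Credit Amount given Telephone):
cpt = [[0.8286, 0.1364, 0.0300, 0.0033, 0.0017],   # Telephone = None
       [0.6308, 0.2347, 0.0807, 0.0489, 0.0049]]   # Telephone = Yes
print(round(avg_distance(cpt), 4))                 # 0.0107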

Table 1. Cpt of the variable Credit Amount

                 Credit Amount
Telephone   0-4000   4000-8000   8000-12000   12000-16000   16000-20000
None        0.8286   0.1364      0.0300       0.0033        0.0017
Yes         0.6308   0.2347      0.0807       0.0489        0.0049

Fig. 1. Variable Credit Amount with its three children and the calculation of S_Credit Amount

3.2 A New Variable Selection Score: NS_j

The heuristic developed by Cinicioglu & Shenoy (2012) tries to identify the variables which possess a high level of association with their parents and children. With that purpose the variable selection score developed, S_j, is comprised of two parts: S_j is the sum of the average distance of the variable of interest and the average of the average distances of its children. This way, with the S_j score a variable is evaluated by considering both the association with its parents and the association with its children. However, this approach also has the drawback that the variables without parents or children are penalized for inclusion in the final Bayesian network. Consider the formula of the S_j score given in Equation (2). A variable without parents will only have a marginal probability distribution, not a cpt, and thus its average distance will be considered to be zero. Similarly, for a variable which does not have any children the S_j score will be equal to its average distance. The resulting S_j scores for a variable without parents and for a variable without children are given in Equations (3) and (4) respectively:

S_j = \frac{1}{c_j} \sum_{i=1}^{c_j} d_{ij}    for a variable j without parents    (3)

S_j = d_j    for a variable j without children    (4)

As illustrated above, because of the formulation of the S_j score, variables which do not possess parents or children will be punished in the variable selection process. If such a variable which lacks parents or children nevertheless has a strong association with the part that is present

(its parents or children, depending on the case), then this selection process may cause networks with lower performance to be created. To overcome this problem, in this research a modified version of the S_j score, NS_j, is presented. For variables which lack either parents or children the score remains the same as the old one. For variables which possess both parents and children, on the other hand, NS_j is equal to half of the old S_j score. These two cases are formulated in Equations (5) and (6) given below:

NS_j = S_j    for a variable j without parents or without children    (5)

NS_j = \frac{S_j}{2} = \frac{1}{2} \left( d_j + \frac{1}{c_j} \sum_{i=1}^{c_j} d_{ij} \right)    for a variable j with both parents and children    (6)

Variables which have neither parents nor children are eliminated from the network. In the following section both of these heuristics are used to learn BNs from the credit data set introduced in Section 2, and their performance is evaluated in terms of prediction capacity and the improvement obtained compared to the marginal model.

4 Evaluation of the Proposed Heuristic

In this section the performance of the variable selection scores S_j and NS_j is compared. The evaluation is made in terms of the prediction capacity and improvement of the BNs created using the suggested scores. For the application of the heuristic it is first necessary to learn an initial BN from the data set. For illustration and evaluation of the suggested scores the credit data set given in Section 2 is used. For learning BNs from the data set, WinMine (Heckerman et al., 2000), software developed by Microsoft Research, is used. The main advantage of WinMine is its ability to automatically calculate the log scores and lifts over marginal of the learned BNs. The log score is a quantitative criterion to compare the quality and performance of the learned BNs; its formula is given below, where n is the number of variables and N is the number of cases in the test set:

LogScore = -\frac{1}{nN} \sum_{k=1}^{N} \log_2 p(x_k \mid \text{model})    (7)

For the calculation of the log score, the dataset is divided into a 70/30 train and test split,¹ and the accuracy of the learned model on the test set is then evaluated using the log score. Using WinMine, the difference between the log scores of the provided model and the marginal model can also be computed; this difference is called the lift over marginal. A positive difference signifies that the model outperforms the marginal model on the test set.

¹ In WinMine only the percentage of the test/training data may be determined. Using different software in further research, 10-fold cross-validation would increase the validity of the results.
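The scoring rules in Equations (5)-(7) can be sketched in Python as follows. The function names are illustrative, and the log score is written here as the negative average log2 probability per variable and per case, an assumption consistent with Table 2 below, where lower LogScore values accompany higher prediction rates; this is not necessarily WinMine's exact internal definition.

```python
import math

def ns_score(s_j, has_parents, has_children):
    """NS_j, Equations (5)-(6): halve S_j only when both parts exist."""
    return s_j / 2.0 if (has_parents and has_children) else s_j

def log_score(case_probs, n_vars):
    """Equation (7): negative average log2 p(case | model), per variable.

    case_probs holds p(x_k | model) for each of the N test cases.
    """
    n_cases = len(case_probs)
    return -sum(math.log2(p) for p in case_probs) / (n_vars * n_cases)

def lift_over_marginal(model_score, marginal_score):
    """Positive when the provided model outperforms the marginal model."""
    return marginal_score - model_score
```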

The initial BN learned from the credit data set is given in Figure 2 below.

Fig. 2. The initial BN learned from the credit data set, containing all of the variables

Using the cpts obtained through the initial BN we can calculate both the S_j and NS_j scores. Figure 3 below depicts the graph of both the S_j and NS_j scores for the 21 variables used in the initial BN. For seven variables in the network the corresponding S_j and NS_j scores agree; these seven variables are the ones which lack either parents or children.

Fig. 3. Graph of the S_j and NS_j scores calculated using the cpts obtained from the initial BN

In our analysis we want to compare the performance of these two variable selection scores. For that purpose two sets of variables are created, one by selecting the variables with the highest S_j scores and the second by selecting those with the highest NS_j scores. Using the selected variables, the corresponding BNs are learned, and their performance is compared in terms of the prediction capacity of the provided model and the improvement obtained; the selection loop itself is sketched below.
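The iterative procedure, rank the variables, drop the worst, relearn, can be outlined as follows. This is only an illustrative outline of the heuristic's loop, not the authors' code: learn_bn is a hypothetical wrapper around whichever structure learner is available (the paper uses WinMine), score_fn computes S_j or NS_j for a variable of the learned network, and data is assumed to be a pandas DataFrame.

```python
def select_variables(data, score_fn, target_count):
    """Iteratively drop the variable with the lowest selection score.

    learn_bn is a hypothetical structure-learning wrapper; score_fn(bn, v)
    returns the S_j or NS_j score of variable v in the learned network bn.
    """
    variables = list(data.columns)
    while len(variables) > target_count:
        bn = learn_bn(data[variables])        # relearn on the remaining variables
        scores = {v: score_fn(bn, v) for v in variables}
        worst = min(scores, key=scores.get)   # lowest score = weakest association
        variables.remove(worst)
    return variables
```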

As the next step the same process is repeated, using the cpts of the new BNs to calculate new S_j and NS_j scores. Accordingly, the variables to be excluded from the network are decided by their ranking on the variable selection score considered, S_j or NS_j. In our analysis we repeated the steps five times and created BNs using 17, 15, 13, 11 and 8 variables, all selected according to their ranking on the corresponding variable selection score. The performance results are listed in Table 2 below. Both of the variable selection scores obtain better results compared to the marginal model and also compared to the average distance measure. Notice that the results of the BNs created using the average distance d_j alone are also listed in the same table; this is done for comparison purposes, to illustrate that both of the variable selection scores result in superior performance compared to the average distance measure. Additionally, in almost all the networks considered, the exception being the BN with 17 variables, we obtained better performing networks using the NS_j score, both in terms of the prediction capacity and the improvement obtained.

Table 2. Performance results of the variable selection scores S_j and NS_j (results rounded to two decimal places)

                      LogScore   Prediction rate   Lift over marginal   Improvement obtained
Initial BN            0.76       59.13%            0.19                 7.13%
Top 17     d_j        0.83       56.38%            0.17                 6.30%
           S_j        0.73       60.30%            0.21                 8.20%
           NS_j       0.77       58.58%            0.19                 7.36%
Top 15     d_j        0.77       58.46%            0.19                 7.18%
           S_j        0.72       60.87%            0.22                 8.48%
           NS_j       0.72       60.87%            0.22                 8.48%
Top 13     d_j        0.78       58.29%            0.21                 7.84%
           S_j        0.73       60.23%            0.20                 7.65%
           NS_j       0.66       63.27%            0.22                 8.95%
Top 11     d_j        0.73       60.39%            0.19                 7.35%
           S_j        0.72       60.66%            0.19                 7.41%
           NS_j       0.65       63.74%            0.22                 9.06%
Top 8      d_j        0.76       58.87%            0.18                 6.97%
           S_j        0.67       62.65%            0.18                 7.44%
           NS_j       0.66       63.14%            0.22                 9.08%

5 Results, Conclusions and Further Research

In order to ensure the prediction capacity of a BN learned from a data set, and to be able to discover hidden information inside a big data set, it is necessary to select the right set of variables to be used in the BN to be learned.

This problem is especially apparent when there is a huge set of variables and the provided data is limited. In the last decade research on structure learning algorithms for BNs has grown substantially. Though there exists a wide literature on variable selection in statistical models, the research conducted on variable selection in BNs remains limited. The variable selection measures developed for statistical models have been adapted by the machine learning community for evaluating the overall performance of a BN and do not provide guidance in variable selection for creating a good performing BN. The variable selection score S_j (Cinicioglu & Shenoy, 2012) provides a sound performance for the prediction capacity of the resulting network; however, it has the drawback that variables without parents or children are punished for inclusion in the network. Motivated by that problem, in this research we suggest a modification of the S_j score, called NS_j, which fixes the problem inherent in its predecessor S_j. A credit score data set is used for applying the proposed heuristic. The performance of the resulting BNs is evaluated using the log score and the lift over marginal, which provide the prediction capacity of the network and the improvement obtained using the provided model compared to the marginal model. These results are compared with the results obtained using the distance measure and the S_j score. Accordingly, the newly developed NS_j score shows better performance both in terms of prediction capacity and in terms of the improvement obtained. For further research, different variable selection scores from statistical models and different data sets may be used to evaluate the results of the proposed heuristic.

Acknowledgements. We are grateful to two anonymous reviewers of IPMU 2014 for comments and suggestions for improvements. This research was funded by the Istanbul University Research Fund, project number 27540.

References

1. Cinicioglu, E.N., Shenoy, P.P.: A new heuristic for learning Bayesian networks from limited datasets: a real-time recommendation system application with RFID systems in grocery stores. Annals of Operations Research, 1-21 (2012)
2. Cui, G., Wong, M.L., Zhang, G.: Bayesian variable selection for binary response models and direct marketing forecasting. Expert Systems with Applications 37, 7656-7662 (2010)
3. Heckerman, D., Chickering, D.M., Meek, C., Rounthwaite, R., Kadie, C.: Dependency networks for inference, collaborative filtering, and data visualization. Journal of Machine Learning Research 1, 49-75 (2000)
4. Hruschka Jr., E.R., Hruschka, E.R., Ebecken, N.F.F.: Feature selection by Bayesian networks. In: Tawfik, A.Y., Goodwin, S.D. (eds.) Canadian AI 2004. LNCS (LNAI), vol. 3060, pp. 370-379. Springer, Heidelberg (2004)
5. Koller, D., Sahami, M.: Toward optimal feature selection. In: Proceedings of the 13th International Conference on Machine Learning (ICML 1996), pp. 284-292 (1996)
6. Murphy, P.M., Aha, D.W.: UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California, Irvine, CA (1994)
7. Sun, L., Shenoy, P.P.: Using Bayesian networks for bankruptcy prediction: some methodological issues. European Journal of Operational Research 180(2), 738-753 (2007)