administration neural network vs. induction methods for knowledge classification

Decson support methods n dabetc patent management by nsuln admnstraton neural network vs. nducton methods for knowledge classfcaton Ambrosadou, B, Vadera, S, Shankararaman, V and Gouls, D Ttle Authors Type URL Publshed Date 2000 Decson support methods n dabetc patent management by nsuln admnstraton neural network vs. nducton methods for knowledge classfcaton Ambrosadou, B, Vadera, S, Shankararaman, V and Gouls, D Conference or Workshop Item Ths verson s avalable at: http://usr.salford.ac.uk/9403/ USIR s a dgtal collecton of the research output of the Unversty of Salford. Where copyrght permts, full text materal held n the repostory s made freely avalable onlne and can be read, downloaded and coped for non commercal prvate study or research purposes. Please check the manuscrpt for any further copyrght restrctons. For more nformaton, ncludng our polcy and submsson procedure, please contact the Repostory Team at: usr@salford.ac.uk.

From: B.V. Ambrosadou, S. Vadera, V. Shankararaman, D. Gouls, and G. Gogou, Decson Support Methods n Dabetc Patent Management by Insuln Admnstraton. Neural Networks vs Inducton Methods for Knowledge Classfcaton Proc. 2nd ICSC Symposum on Neural Computaton, Berln, (2000). Decson Support Methods n Dabetc Patent Management by Insuln Admnstraton Neural Network vs. Inducton Methods for Knowledge Classfcaton B.V.Ambrosadou 1, S. Vadera 2, V. Shankararaman 1, D. Gouls 3, 4 1 Department of Computer Scence, Unversty of Hertfordshre, England 2 Department of Computer Scence, Unversty of Salford, England 3 Clncal Research Fellow n Endocrnology & Dabetes, Imperal College School of Medcne, St. Mary's Hosptal, London, UK 4 Medcal Informatcs Laboratory, Arstotelon Unversty of Thessalonk Abstract Dabetes melltus s now recognsed as a major worldwde publc health problem. At present, about 100 mllon people are regstered as dabetc patents. Many clncal, socal and economc problems occur as a consequence of nsuln-dependent dabetes. Treatment attempts to prevent or delay complcatons by applyng optmal glycaemc control. Therefore, there s a contnuous need for effectve montorng of the patent. Gven the popularty of decson tree learnng algorthms as well as neural networks for knowledge classfcaton whch s further used for decson support, ths paper examnes ther relatve merts by applyng one algorthm from each famly on a medcal problem; that of recommendng a partcular dabetes regme. For the purposes of ths study, OC1 a descendant of Qunlan s ID3 algorthm was chosen as decson tree learnng algorthm and a generatng shrnkng algorthm for learnng arbtrary classfcatons as a neural network algorthm. These systems were traned on 646 cases derved from two countres n Europe and were tested on 100 cases whch were dfferent from the orgnal 646 cases. 1.0 Introducton Dabetes s a chronc dsease. The prevalence of dabetes melltus s rsng, especally n developng countres as they adopt a Western lfestyle [1]. Long term complcatons nvolvng the eyes, the kdneys, the central and perpheral nervous system may appear to those patents that do not acheve an optmal glycaemc control. Dabetc patent management may be acheved by approprate det, exercse and nsuln admnstraton for glycaemc control of Type I or Type II nsuln dependent dabetc patents. There s dvergence of opnons regardng the ssue of nsuln admnstraton among the experts, whch manly depends on soft nformaton such as ther educatonal and socal background as well as ther experence and gut feel. An effort has been made to systematse the process of nsuln admnstraton by developng decson support tools, whch facltate a consstent and objectve decson makng among specalsts [2,3]. The knowledge n ths feld has been manly acqured from a number of sources across Europe n order to allevate the dfferences due to lfestyles and det habts. A DELPHI approach was followed n order to arrve at consensus opnon n those cases where there was dvergence [4]. In ths paper two methods are compared for knowledge classfcaton n nsuln admnstraton whch may later be used for decson support purposes. These are the decson tree learnng and the neural network algorthmc approaches. The followng sectons descrbe the doman under consderaton, the two methods as well as the results obtaned. 1.1 The Doman: Insuln Regme Prescrpton and Dose Adjustment Insuln regmens and dose adjustment are prescrbed by dabetologsts dependng on a number of factors such as dabetes type, patent age, actvty durng the day and control targets. There are three nsuln types dependng on the begnnng and duraton of ther acton. These are the fast, ntermedate and long actng nsuln types. Insuln regmens are nsuln types n combnatons admnstered n daly profles. As nsuln admnstraton proceeds food ntake by approxmately half an hour, t may take place before breakfast, lunch, dnner and bed.

The most wdely used nsuln regmens are: 1. short-actng nsuln mxed wth ntermedateactng nsuln, gven twce daly before meals 2. short-actng nsuln mxed wth ntermedateactng nsuln, gven before breakfast AND shortactng nsuln, gven before evenng meal AND ntermedate-actng nsuln, gven at bedtme 3. short-actng nsuln, gven three tmes daly AND ntermedate-actng nsuln, gven at bedtme 4. ntermedate-actng nsuln, gven once daly Regmen No Pre-Breakfast Pre-Lunch Pre-Tea Pre-Bed 1 Medum actng - - - 2 Fast+Medum actng - Fast+Medum actng - 3 Fast actng Fast actng Fast actng Medum actng 4 Fast actng Fast actng Fast actng Fast actng Table 1. Most common regmens. Step No Subject Acton 1 group of experts n UK & Greece complaton of lst : most wdely-used nsuln regmens 2 authors complaton of questonnare : decson-makng parameters for nsuln regmen selecton 3 group of experts n UK & Greece completon of questonnare : decson-makng parameters for nsuln regmen selecton 4 authors software development : Machne Learnng (ML) & Neural Network (NN) for Regmen Adjustment 5 group of experts n UK & Greece completon of a table : 646 dabetc cases complete wth approprate regmen 6 authors Software tranng : tranng of ML & NN Regmen Adjustment software, usng the 646 cases 7 authors Software evaluaton : evaluaton of ML & NN Regmen Adjustment software, usng 100 addtonal cases Table 2. Study desgn. Generally, the regmens that contan long actng nsuln are not prescrbed because they are consdered outdated and they are used only out of necessty [6-8], especally n elderly patents. The ntensve glucose control (regmen 3) gves the best results n terms of blood glucose control but, on the other hand, results n more hypoglycaemc epsodes (low blood glucose) whch s a, potentally, dangerous stuaton. The result of ntensve glucose control s the most frequent appearance of hypoglycema (low blood glucose) whch s even more dangerous than hyperglycema (hgh blood glucose values) [6], [7], [9]. Dependng on specal occasons other regmens can also be prescrbed. These lfe crcumstances nclude coexstence of another acute or chronc dsease, short lfe expectancy, honeymoon perod, psychosomatc problems due to njectons, the envronment and nablty of the patent to understand the demands of a partcular regmen [6-8]. 2.0 The Comparatve Study An Overvew 2.1 Study Desgn The prototype systems descrbed n ths paper classfy clncans knowledge n the doman of nsuln admnstraton. In ths way, they may further support decson makng n ths feld. The whole study desgn s presented n Table 2. 2.2 System Knowledge For the purposes of the development of the ML & NN software for nsuln regmen specfcaton, a lst of

regmens was compled by ntervewng dabetes experts n the Unted Kngdom and Greece (Step 1). Subsequently, a questonnare was prepared (Step 2) and was sent to three dabetc departments of UK and fourteen dabetologcal centres n Greece. The questonnare was askng for the parameters that were necessary n order to decde for each specfc nsuln regmen (Step 3), among the ones that were proposed n Step 1. The outcome of the above process was a choce of the man factors ntutvely used by doctors for nsuln regmen prescrpton. These are: Dabetes type (unknown, type I, type II) Patent age What the patent s used to takng (unknown, tablets, nsuln) Specal condton (pregnancy, surgery, nfecton) Dawn phenomenon (yes, no) Unstable ("brttle") dabetes (yes, no) Glucose profle (mornng, afternoon, evenng, nght / unknown, normal, hyperglycema, hypoglycema) Physcal actvty (mornng-noon, afternoonevenng, nght / none-unknown, sedentary, lght, heavy) Food ntake (breakfast, lunch, tea, dnner) Desrable blood glucose control (far, good, very good) In the same way the man factors used by doctors for nsuln dose adjustment are: Insuln regmen Current dose Glucose values Glucose profle (mornng, afternoon, evenng, nght / unknown, normal, hyperglycema, hypoglycema). Cas DM age spe Pre targdawuns BGludbemorafnbre BG- BG- BG-PA- PA- PA- FI- e typ cal vo et n tablbre No e us Rx e FIludn FI- FIbed Reg me n No 1 1 n vg n n n h h h h l - y y y y 4 2 1 40 y vg n n h nl nl nl s - - y y y n 4 Table 3. Parameters for nsuln regmen prescrpton wth two sample cases. parameter values explanaton DM type 1 dabetes melltus, type 2 age number age n years specal yes (y) specal condton : pregnancy, surgery, nfecton prevous Rx nsuln () patent's prevous regme tablets (t) target far (f) desrable dabetes control good (g) very good (vg) dawn yes (y) dawn phenomenon unstable yes (y) unstable dabetes BG-bre normal (nl) blood glucose - breakfast hyperglycema (h) hypoglycema (lo) BG-lun normal (nl) blood glucose - lunch hyperglycema (h) hypoglycema (lo) BG-dn normal (nl) hyperglycema (h) hypoglycema (lo) blood glucose - dnner

BG-bed PA-mor PA-aft PA-ng FI-bre FI-lun FI-dn FI-bed normal (nl) hyperglycema (h) hypoglycema (lo) sedentary (s) lght (l) heavy (h) sedentary (s) lght (l) heavy (h) sedentary (s) lght (l) heavy (h) yes (y) yes (y) yes (y) yes (y) blood glucose - bed physcal actvty - mornng physcal actvty - afternoon physcal actvty - nght food ntake - breakfast food ntake - lunch food ntake - dnner food ntake - bed Legend for Table 3. 3.0 Methods 3.1 Machne Learnng Algorthms Snce ts ntal development, the ID3 machne learnng algorthm has provded the bass for several decson tree learnng methods. Fgure 1 shows some of the well known algorthms that have evolved to reduce defcences n ID3 and to tackle ssues such as costs of carryng out tests. CLS 63 ACLS 81 AQ 69 ID5R 89 CART 84 OC1 94 ID3 79 C4.5 92 EG 91 CN2 88 CS-ID3 93 ICET 95 Fgure 1: Some well known decson tree learnng algorthms These algorthms have been appled n varous domans, and several expermental studes have evaluated ther accuracy usng benchmarkng data From the set of avalable algorthms, Murthy s OC1 [10] was chosen to represent the decson tree learnng algorthms famly, manly because t constructs trees usng a core algorthm that t nherts from ID3, and whch s a characterstc of most decson tree learnng algorthms. Secton 3.1.1 gves an overvew of the algorthm, and secton 3.1.2 presents the results obtaned. 3.1.1 Overvew of the OC1 Algorthm The man prncple behnd tree nducton algorthms s to develop a tree that s consstent wth the table of examples representng the tranng set. The core procedure to obtan such a tree can be summarsed as follows: If all the result values are the same, stop Otherwse Select the root from the avalable attrbutes Partton the table so that each sub-table's Root column has the same value. Apply the algorthm recursvely to fnd a decson tree for each of the sub-tables, gnorng the Root attrbute. The selecton of the root s usually based on a metrc that utlse nformaton theoretc measures [11].

Numerc attrbutes are parttoned by consderng all potentally useful axs parallel splts [12], or by adoptng lnear dvsons [13]. The OC1 algorthm provdes optons for utlsng a range of selecton metrcs, and allows the use of axs parallel splts, as well as a scheme for lnear splts. Ths type of process can produce deep trees that over traned, wth termnal nodes based on only a few examples. There are varous tree prunng methods that can be used to overcome ths problem [14]. The central operaton n post-prunng s to replace a subtree by a leaf node whch takes the value of the majorty class of the examples n the subtree provded t results n an mprovement n accuracy. OC1 adopts a technques, known as the mnmum costcomplexty prunng method [15]. Ths technque works n two stages. Frst t generates a seres of pruned trees. Ths s done by consderng the effect of replacng each subtree by a leaf and workng out the reducton n error per leaf (α): R( t) R( Tt ) α = Nt 1 where R(t)s the expected error rate f the subtree s pruned, R(T t ) s the expected error rate f the subtree s not pruned, and N t s the number of leaf nodes n the subtree. The subtree wth least value per leaf (smallest α) s pruned by replacng the subtree by ts majorty class, and the process repeated recursvely to generate a seres of ncreasngly pruned trees. Second, once a seres of ncreasngly pruned trees have been generated, t uses an ndependent testng set to select a tree that s the most accurate (more precsely, t selects the smallest tree wthn one standard error of the most accurate tree) 3.1.2 Results The OC1alogorthm was tested by applyng t on the tranng dataset and testng t on the 100 addtonal cases. It was run wth both an axs-parallel opton set, when t s smlar to C4.5, and wth the use of lnear dvsons. The Informaton gan measure was used for the selecton metrc, and all other parameters were set to ther default values (.e., they were not tuned n any way). The followng table summarses the results n terms of accuracy and sze of trees for the dfferent varatons. In terms of comprehensblty, the smple splts adopted by utlsng axs-parallel splts are easer to relate to the doman by an expert than the lnear equatons present when lnear dvsons are used. The use of lnear dvsons wth prunng results n the smallest tree, although t s the only tree that s not 100% correct. RESULTS BEFORE PRUNING Max Depth Leaves Accuracy Axs Parallel Verson 14 42 100% Lnear Dvson Verson 9 34 100% RESULTS AFTER PRUNING Axs Parallel Verson 11 32 100% Lnear Dvson 8 17 99% Cat 1 = 100% (8/8) Cat 2 = 97.92% (47/48) Cat 3 = 100% (37/37) Cat 4 = 100% (7/7) Table 4 Results 3.2 Neural Network Algorthm 3.2.1. The Generatng-Shrnkng Algorthm System Archtecture The descrpton here s based on [16. The algorthm uses a three layer feed-forward neural network. The frst, nput, layer () contans lnear neurons, the second, hdden, layer () contans sgma-p neurons, and the thrd, output, layer () contans lnear threshold neurons. The number of neurons n and equals the number of tranng patterns, and the number of neurons n equals the number of classes (n our case nsuln regmens) that the neural network has to dstngush. The frst layer has as nput an n-dmensonal pattern vector p R n to be classfed, along wth a fxed reference number r R. Together p and r they form an (n+1)-dmensonal vector (p,r) R n+1. The neurons' outputs n the thrd layer are bnary, so when the output of the th neuron s 1, the nput pattern belongs to the th class. In ths sense, the network output s defned to be the ordnal number of the neuron n the output layer whose output s 1. The nput-output assocatons of

each layer are descrbed by the followng three equatons: Input Layer: n + 1 j= 1 o = w p, j j 1, 2,...,n = (1) where o s the output of the th neuron n the nput layer, w j s the weght of the connecton between the jth nput component and the th neuron n the nput layer, n s the number of neurons n the nput layer and p' j s the jth component of p'. Hdden Layer: o = n f j= 1(j ) 1, 2,...,n = (2) where ( x) ( w o + w o ) j j 1, x > 0 f =, 0, x 0 o s the output of the th neuron n the hdden layer, w j s the weght of the connecton between the output of the jth neuron n the nput layer and the th neuron n the hdden layer and n s the number of neurons n the hdden layer., Output Layer: n o = f wj o j 0. 5, = 1, 2,...,n (3) j= 1 where o s the output of the th neuron n the output layer, w j s the weght of the connecton between the output of the jth neuron n the hdden layer and the th neuron n the output layer and n s the number of neurons n the output layer. 3.2.4. Neural Network Results At frst the Neural Network was not traned at all. The 646 cases were entered one by one as an nput to the NN. If the Neural Network's output was the same answer as the expert's decson the next case was entered n the NN. If not, the case was used as a new tranng pattern for the NN before we go on to the next one. In ths way the Neural Network had a lttle more "experence" before the processng of the next case. In ths way after the 1st pass of the 646 cases, the correct answers were 452 and the Neural Network had 194 tranng patterns (the wrong answers). The above process was repeated wth the cases that were correctly answered to check f the patters that were added would change the Neural Network's decson, untl the Neural Network was able to gve the correct answer for all cases. 1st pass 2nd pass 3rd pass 4th pass Number of cases 646 452 412 409 Correct answers 452 (70,0 %) 412 (91,1 %) 409 (99,2 %) 409 (100,0 %) Wrong answers 194 (30,0 %) 40 (8,9 %) 3 (0,8 %) 0 (0,0 %) Tranng patterns 194 234 237 237 Table 5: Tranng results Regmen Correct Wrong % 1 4 8 33% 2 30 6 83% 3 22 16 58% 4 2 12 14% Totals 58 42 58% Table 6: Evaluaton of the Neural Network

5. Concluson It was found that the decson tree learnng algorthm out-performed the neural network n the evaluaton phase. As a next step, the expert wll comment on the outcome of the two algorthms, n order to determne why the neural network dd not perform as expected. Bblography 1. Unger R, Foster D. (1992) Dabetes melltus. In: Wlson J, Foster D (eds). Wllams textbook of endocrnology, 8th ed. (Phladelpha, WB Saunders), 1255-1334. 2. Ambrosadou V., Gouls D.G., Pappas C. (1996) "Clncal Evaluaton of the DIABETES Expert System for Decson Support by Multple Regmen Insuln Dose Adjustment". Computer Methods and Programs n Bomedcne, Vol. 49, 105-115. 3. Ambrosadou V., Gogou G., Pappas C., Maglaveras N. (1996) "Decson Support for Insuln Regme Prescrpton Based on a Neural Network Approach". Journal of Medcal Informatcs, Taylor & Francs, Vol. 21, No.1, 23-34. 4. Ambrosadou V., Gouls D. (n press) "The DELPHI Method as a Consensus and Knowledge Acquston Tool for the Evaluaton of the DIABETES System for Insuln Admnstraton". Journal of Medcal Informatcs. 5. Luger G.F., Stubblefeld, W.A. (1998) Artfcal Intellgence Structures and Strateges for Complex Problem Solvng, Thrd Edton, Addson Wesley Longman. 6. Foster D.W. (1987) "Dabetes melltus", n Harrson's Prncples of Internal Medcne, 11th ed, Mc Graw-Hll, 1778-1797. 7. Garber A.J. (1994) "Dabetes melltus", n Sten J.H. (edtor), Internal Medcne, 4th ed, Mosby, 1391-1430. 8. Group of European Poltcs IDDM. (1993) "Instructons for the Treatment of Insuln- Dependent (type I) Dabetes- Consensus Gudelnes ". 9. Mller AS, BH Blott, TK Hames (1992) "Revew of neural network applcatons n medcal magng and sgnal processng", Med & Bol Eng & Comput, vol. 30, 449-464. 10. S. Murthy and S. Kasf and S. Salzberg (1994), A System for Inducton of Oblque Decson Trees, Journal of Artfcal Intellgence Research, 2, 1-32 11. Vadera, S. and Nechab, S. (1994), ID3, ts Chldren and ther Safety }, BCS Specalst Group on Expert Systems Newsletter 31,11-21 12. Qunlan, J.R. (1992), C4.5 : A Program for Machne Learnng, Calforna:Morgan Kauffman 13. Vadera, S. and Nechab, S. (1996), Inducng Safer Trees, Proceedngs of the Internatonal Conference on Knowledge Based Computng Systems, Narosa Publshng, pp205-216. 14. Breslow, L.A. and Aha, D.W. (1997), Smplfyng decson trees: A survey, Knowledge Engneerng Revew, 12,1-40. 15. L. Breman, L. and Fredman, J.H. and Olsen, R.A. and Stone, C. J. (1984), Classfcaton and Regresson Trees, Belmont: Wadsworth 16. Yan Qu Chen, Davd W., Thomas, and Mark S. Nxon. (1994) "Generatng-Shrnkng Algorthm for Learnng Arbtrary Classfcaton". Neural Networks Vol. 7, No. 9, 1477-1489.