Subsampling for Efficient and Effective Unsupervised Outlier Detection Ensembles

Arthur Zimek, Matthew Gaudet, Ricardo J. G. B. Campello, Jörg Sander
Department of Computing Science, University of Alberta, Edmonton, AB, Canada

ABSTRACT

Outlier detection and ensemble learning are well established research directions in data mining, yet the application of ensemble techniques to outlier detection has rarely been studied. Here, we propose and study subsampling as a technique to induce diversity among individual outlier detectors. We show analytically and experimentally that an outlier detector based on a subsample per se, besides inducing diversity, can, under certain conditions, already improve upon the results of the same outlier detector on the complete dataset. Building an ensemble on top of several subsamples improves the results further. While in the literature so far the intuition that ensembles improve over single outlier detectors has just been transferred from the classification literature, here we also justify analytically why ensembles are expected to work in the unsupervised area of outlier detection. As a side effect, running an ensemble of several outlier detectors on subsamples of the dataset is more efficient than ensembles based on other means of introducing diversity and, depending on the sample rate and the size of the ensemble, can be even more efficient than just the single outlier detector on the complete data.

Categories and Subject Descriptors: H.2.8 [Database Applications]: Data mining

Keywords: outlier detection; ensemble

Footnote: This work was done while the author was on leave of absence from Ludwig-Maximilians-Universität München, Germany.
Footnote: This work was done while the author was on sabbatical leave from University of São Paulo, São Carlos, Brazil.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. KDD'13, August 11-14, 2013, Chicago, Illinois, USA. Copyright 2013 ACM /13/08...$15.00.

1. INTRODUCTION

An outlier is an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data [6]. Detecting outliers is an important task in many practical applications. Some applications of outlier detection, such as detecting measurement errors, are mostly concerned with removing the outliers from the data as a form of noise. Other applications, such as credit card abuse detection, or the identification of unusual measurements in scientific data, are concerned with finding outliers because their deviating behavior from the rest of the data may require specific actions or provide opportunities for new insights.

Various approaches to outlier detection have been proposed, based on different notions of outliers, or targeted towards specific applications that require the identification of outliers. Here, we are interested in unsupervised, nonparametric outlier detection methods that assign a score to each data object and thus allow a ranking of objects according to their degree of outlierness.

Parametric, statistical approaches [6, 35] fit certain distributions to the data by estimating the parameters of these distributions from the given data. A problem with these approaches is that distribution parameters such as mean, standard deviation, and covariances are rather sensitive to the presence of outliers. Possible effects of outliers on the parameter estimation have been termed masking and swamping.
Outliers can mask their own presence by influencing the values of the distribution parameters (resulting in false negatives), or swamp inliers so that they appear outlying due to the influenced parameters (resulting in false positives) [6, 9].

Non-parametric approaches do not assume a specific distribution of the data, but estimate (explicitly or implicitly) certain aspects of the probability density. Non-parametric methods include the well-known distance-based and density-based methods. Both distance-based and density-based methods basically aim at providing a rather simple estimate of the density around points, which can be seen as an approximation of statistical kernel density estimates. Distance-based methods such as DB-outlier [25] and its variants are based on the k nearest neighbor (kNN) distances [5, 34], trying to find so-called global outliers as points that are, roughly speaking, far away from the rest of the data. Density-based methods such as LOF [0] and its variants try to find so-called local outliers as points that are, roughly speaking, located in an area of relatively low density compared to their kNN (intended to indicate points that are outliers with respect to the nearest mode in the data distribution). The density around points in these methods is also estimated based on kNN distances. One problem with distance-based and density-based methods is that they can

also suffer from effects similar to masking and swamping, due to the simplicity of (and thus error in) the density estimates. Another problem is the typically high runtime of these approaches, due to the fact that their computation includes at least finding the kNN of each data point (resulting in an at least quadratic complexity w.r.t. the database size).

In this paper, we address both problems of distance-based and density-based methods. We propose and study a general approach to improve both the quality and the performance of such outlier detection methods by combining into an ensemble the results of a base method on subsamples of the data. Previous work on outlier ensembles is very limited and only shows empirically that ensembles of outlier detectors have the potential to improve the quality, compared to that of their base methods [30, 36], at an increased runtime cost. Our work is novel and advances the area of outlier detection in the following respects:

- We argue theoretically and demonstrate empirically that it is possible to construct ensemble members for outlier detection methods which individually already perform better than the base method, in general. Combining those outlier detectors into an ensemble not only makes the performance gain more robust but can improve the performance even further.
- At the same time, when using small sample sizes for the ensemble members, we can gain considerable speed-up in runtime compared to running a standard ensemble and, for small ensemble sizes, even compared to running the base method on the whole data set.
- The proposed principle is fundamental and flexible. It does not rely on specific data types. It can be combined with various conventional outlier detection techniques.

The rest of the paper is organized as follows: We discuss related work on outlier detection and ensembles for outlier detection (Section 2). We provide theoretical reasoning to support outlier detection ensembles in general and the claimed properties of our method in particular (Section 3).
We provide experimental results to support our claims empirically (Section 4). We conclude the paper in Section 5.

2. RELATED WORK

The distance-based notion of outliers (DB-outlier) [25] was the first database-oriented approach in the area of unsupervised outlier detection, which initiated a new line of research on this topic in the data mining community. Variants of DB-outliers consider the distances to the k nearest neighbors of each object and use these distances to rank the objects [34], or they use the sum of distances to all points within the set of kNN (called the weight) as an outlier degree [5]. These methods are also called global methods in that the computed outlier scores represent global density scores for each point. The so-called local methods, e.g. LOF [0], consider instead local density scores, which are ratios between the density around an object and the density around its neighboring objects. Variants of the local outlier model include LoOP [27] and LOCI [33]. The distance-based method LDOF [44] is also related in reasoning about local comparisons. It has been shown recently [37], however, that the differentiation between global and local methods is not strictly dichotomous but that there are degrees of locality.

Much research has aimed at improving the efficiency of unsupervised outlier detection by algorithmic techniques, for example based on approximations or improved pruning techniques for mining the top-n outliers [4, 7, 22, 23, 26, 42]. An analysis of such efficiency improving techniques for outlier detection algorithms has been provided by Orair et al. [32]. These techniques, however, do not aim at improving the approximations of the underlying statistical notion of outlierness. They only approximate a specific algorithmic model.
Ensemble techniques, on the other hand, have the potential to improve the performance of their components in terms of the quality of the detected outliers, rather than in terms of runtime (but we will show in this paper that it is even possible to gain runtime improvements when constructing certain types of outlier ensembles).

The first approach to improve outlier detection by ensemble techniques, based on feature bagging, was proposed by Lazarevic and Kumar [30], combining different results of the same algorithm (namely LOF [0]) applied to different, randomly selected feature subsets. Feature bagging is a common procedure to induce diversity of ensemble members in ensemble classification [] or ensemble clustering [8, 4, 40]. Subsequent research on outlier detection ensembles focused on the issue of comparability of scores for score combinations, using sigmoid functions and mixture modeling to fit outlier scores, provided by different detectors, into comparable probability values [7], or scaling by standard deviation [3], or statistical reasoning about score distributions [28], enabling the combination of different outlier detection methods into one ensemble. Schubert et al. [36] proposed a similarity measure to appropriately compare different outlier rankings (based on scores) and to allow for the assessment of the diversity of different outlier detectors. As an application, they proposed a greedy ensemble approach, demonstrating the importance of diversity for the performance of an ensemble.

In all these papers, although outlier detection ensembles have been discussed and improved, no new method of inducing diversity has been pursued. Except for feature bagging [30], all other existing ensemble methods for outlier detection [7, 28, 3, 36] are meta-methods and could be used on top of our sample-based method (or on top of feature bagging, as in [28, 3, 36]). They do not propose original means to induce diversity when using a selected base outlier detection method.
In general, while the motivation for ensemble methods for outlier detection is borrowed from the rich tradition in the literature on supervised ensemble learning [,2,2,4], the theoretical foundation for ensemble learning in the unsupervised setting is far less mature. The same holds true not only for outlier detection ensembles but also for clustering ensembles, despite the far more abundant literature on practical approaches in that area [8].

Although the problem setting is considerably different, let us finally note that sampling has been used in ensemble clustering to induce diversity. Different subsamples of the data set have been clustered and the resulting clusterings were combined into a consensus clustering [3, 6, 20, 39].

3. OUTLIER DETECTION ENSEMBLES BASED ON SUBSAMPLING

In this section, we will discuss the potential benefits of using outlier detection ensembles based on subsampling. Previous approaches using ensemble learning for outlier detection [7, 28, 30, 3, 36] transferred techniques without any theoretical foundation for why something that has a clear theoretical background in supervised learning should also work in unsupervised outlier detection. Such a view can be loosely argued for when we consider outlier detection methods as classifiers. When assuming that a threshold on outlier scores is used to distinguish between outliers and inliers, we can view the outlier method as classifying all objects into one of these two classes, outliers and inliers, even though no labels are used in the training phase when the model (ranking) is built. If we succeed in constructing diverse enough outlier detectors for the same data set, we can hope to improve the overall performance over the individual members by combining them into an ensemble. The generic argument given is that all the ensemble members commit errors, but on different cases, if the members are independent, i.e., diverse, or, in other words, if the errors are uncorrelated. While such a generic view may potentially explain some of the performance gains, we will show in the following subsections that there are more specific reasons why (under some general assumptions) an ensemble of outlier detection methods can improve the performance over its individual members.

3.1 Benefits of Ensembles for Outlier Detection Based on Density Estimates

In this paper, we are focusing on distance-based and density-based outlier detection methods, which, as discussed in the introduction, compute outlier scores that are based, implicitly or explicitly, on some form of density estimates. One can view these methods as trying to identify the outliers in a given data set $X$ with respect to an unknown probability density $f$, which represents the process that has generated the majority of the data set (at least the inliers). The data set $X$ itself can be viewed as a sample drawn from the true, but unknown underlying density distribution, and the methods try to estimate the density $f(x)$ around points $x$ using a more or less rough density estimate $\hat{f}_X(x)$ (in order to compute outlier scores in some way).
Assuming the correctness of the underlying outlier model of the methods, it is clear that the quality of a method's result depends on the quality of the density estimate $\hat{f}_X(x)$ and that the results will improve if the estimate can be improved. For this case, we can show formally that a diverse ensemble of such outlier detectors does in fact show an improved expected performance over the individual ensemble members, under some general conditions.

Given a true, smooth p.d.f. $f(x)$ and a data set $X$, we can express an estimate $\hat{f}_X(x)$ of $f(x)$ based on $X$ as:

$$\hat{f}_X(x) = f(x) + v_X(x)$$

where $v_X(x)$ is a random variable describing the error of the estimate due to the finite sample. The quality of the estimate $\hat{f}$ of $f$ decides over success and failure of the outlier detection. However, the density estimates used by the considered outlier detection algorithms may not be reliable and stable in all regions of the data space, due to the natural intrinsic randomness associated with the single sample that the data set represents. If we are able to obtain multiple density estimates for each point $x$ (e.g., as we propose, via subsamples), we can obtain more reliable and stable density estimates by averaging the multiple density estimates for each point.

The rationale for this is the following: The output of outlier methods is a ranking of all points $x$ in terms of outlier scores that, in essence, depends on the ranking of the points according to $\hat{f}_X(x)$. Ideally, we want a ranking of the points $x$ according to $f(x)$. If we have multiple density estimates for each point that we average, we can consider the estimate itself as a random variable, and averaging these estimates for each point gives us the expectation of this variable as:

$$E\{\hat{f}_X(x)\} = E\{f(x)\} + E\{v_X(x)\} = f(x) + E\{v_X(x)\}$$

In this formulation, one can clearly see that the ranking of objects w.r.t. $E\{\hat{f}_X(x)\}$ is the same as the ranking w.r.t. the true density $f(x)$ (the ideal ranking), provided that the expectation of the error $v_X(x)$ in the individual estimates is the same for every point $x$.
This is obviously the case when the random variable that describes the error does not depend on $x$, in which case $E\{v_X(x)\} = E\{v_X\} = \mu_{v_X}$, but one would also obtain the ideal ranking when the error is not independent of $x$; for instance, when the error varies between points but its expectation is the same for each point, we would also have the same ranking. We can even obtain the same ranking as the ideal ranking if the expectations $E\{v_X(x_1)\}$ and $E\{v_X(x_2)\}$ differ for two points $x_1$ and $x_2$, as long as the difference does not cause an inversion between the actual ranks of $E\{\hat{f}_X(x_1)\}$ and $E\{\hat{f}_X(x_2)\}$, respectively. Furthermore, if we consider that for successful outlier detection the methods only have to distinguish between outliers and inliers, we can even allow inversions between ranks, as long as rank inversions occur only within outliers or within inliers. Only a rank inversion between an outlier and an inlier would be problematic.

In the next subsection, we will argue that for the proposed ensemble technique using subsamples, the expectation of the error in the density estimate $E\{v_X(x)\}$ does depend on the location $x$ and its surrounding density, but that the method has the desirable property that it can increase the gap in ranks between the outliers and the inliers, making inversions in rank between these groups of points even less likely.

3.2 Additional Benefits of Subsampling

Subsampling is theoretically well suited to introduce diversity into an ensemble of otherwise identical distance-based or density-based outlier detection methods. Every member of the ensemble will determine the outlier score of every object in the database, but using only a small subset of the data to estimate the density around points. Learning density estimates for outlier detection on smaller samples can actually improve the detection rate of outliers, compared to learning these estimates on the whole data set, which conceptually represents just a somewhat larger sample of an unknown distribution $f$.
We will see in the empirical evaluation that in practice, surprisingly small sample sizes (such as 20% or in many cases even just 10%) typically lead not to a deteriorated but to a considerably improved quality of the outlier detection for a sample-based ensemble of outlier detectors.

One reason for the improved performance of an ensemble is, as expected, just the combination of the results of multiple outlier detectors. Compared to using the dataset as the only sample drawn from $f$, drawing multiple subsamples from this sample can reduce the effect of the randomness associated with a single sample. Note that averaging the scores to build an ensemble has been, heuristically, common practice [7, 28, 30, 3, 36], but now it also finds a theoretical justification.
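The averaging argument can be illustrated with a small simulation. The sketch below is not taken from the paper: the "true" densities, the noise level, and the helper `rank_inversions` are hypothetical, and the errors of the ensemble members are assumed to be independent with the same expectation at every point, which is the idealized condition discussed in Section 3.1 rather than a property of real subsamples.

```python
import numpy as np

def rank_inversions(true_density, estimate):
    """Count point pairs whose order under `estimate` contradicts `true_density`."""
    t = np.asarray(true_density, dtype=float)
    e = np.asarray(estimate, dtype=float)
    sign_true = np.sign(t[:, None] - t[None, :])
    sign_est = np.sign(e[:, None] - e[None, :])
    # Every inverted pair is counted twice (i, j) and (j, i), hence the division.
    return int(np.sum(sign_true * sign_est < 0) // 2)

rng = np.random.default_rng(0)

# Hypothetical true densities f(x) at 50 points (assumption for illustration).
f = np.linspace(0.1, 1.0, 50)

# Error v_X(x): nonzero mean 0.5 that is the same for every point (assumption).
error_mean, error_sd, n_members = 0.5, 0.2, 25

single_estimate = f + rng.normal(error_mean, error_sd, size=f.size)
member_estimates = f + rng.normal(error_mean, error_sd, size=(n_members, f.size))
averaged_estimate = member_estimates.mean(axis=0)

inversions_single = rank_inversions(f, single_estimate)
inversions_averaged = rank_inversions(f, averaged_estimate)
print("inversions, single estimate:  ", inversions_single)
print("inversions, averaged estimate:", inversions_averaged)
```

A constant error shifts every estimate by the same amount and leaves the ranking untouched, while averaging shrinks the remaining random part of the error, so fewer pairs of points end up in the wrong order.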

Another, more interesting reason for the improved performance is that the base method applied to a smaller subsample of a given data set often shows an improved outlier detection rate, compared to the same method applied to the whole data set. As we will argue formally in the following, this is due to the fact that distance-based and density-based methods are essentially using simple (not volume normalized) nearest neighbor distances to estimate density.

To understand the effects of sample-based nearest neighbor distances, consider a sphere of radius $r$ in a $d$-dimensional Euclidean space, containing $n$ data points uniformly distributed within the sphere. The expected Euclidean distance from a point to its $k$-th nearest neighbour (kNN) is given by [9]:

$$E\{d_{kNN}\} = r \left(\frac{k}{n}\right)^{1/d} \qquad (1)$$

For a given data set, let $r$ be a constant value small enough so that, for two spheres having the same radius $r$ but lying at different positions of the data space, the data points within both spheres are approximately uniformly distributed. Now, suppose that the number of data points within each of these spheres is different, given by $n_1$ and $n_2$ ($n_1 \neq n_2$), which means that the densities of the data in the respective regions of the space are different (as their volumes are the same). For example, one sphere might be located inside a dense cluster, whereas the other one might lie in a sparse area containing background noise. Then, it follows from (1) that the expected kNN distances in the corresponding regions of the space are given by:

$$E\{d^{1}_{kNN}\} = r \left(\frac{k}{n_1}\right)^{1/d}; \quad E\{d^{2}_{kNN}\} = r \left(\frac{k}{n_2}\right)^{1/d} \qquad (2)$$

If one randomly removes a fraction $1 - m$ of the data objects with equal probability, so that a fraction $m$ remains, the expected numbers of remaining objects within those two spheres are given by $n_1 m$ and $n_2 m$, respectively.
In this case, the expected kNN distances become:

$$E\{d^{1}_{kNN}\} = r \left(\frac{k}{n_1 m}\right)^{1/d}; \quad E\{d^{2}_{kNN}\} = r \left(\frac{k}{n_2 m}\right)^{1/d} \qquad (3)$$

The differences in the expected distances are therefore:

$$\Delta_1 = r \left(\frac{k}{n_1 m}\right)^{1/d} - r \left(\frac{k}{n_1}\right)^{1/d} = r \left(\frac{k}{n_1}\right)^{1/d} \left(\frac{1 - m^{1/d}}{m^{1/d}}\right) \qquad (4)$$

$$\Delta_2 = r \left(\frac{k}{n_2 m}\right)^{1/d} - r \left(\frac{k}{n_2}\right)^{1/d} = r \left(\frac{k}{n_2}\right)^{1/d} \left(\frac{1 - m^{1/d}}{m^{1/d}}\right) \qquad (5)$$

In relative terms, if we divide $\Delta_1$ and $\Delta_2$ by the original expected distances (for the full dataset, i.e., before the subsampling), we get:

$$\frac{\Delta_1}{r \left(\frac{k}{n_1}\right)^{1/d}} = \frac{\Delta_2}{r \left(\frac{k}{n_2}\right)^{1/d}} = \frac{1 - m^{1/d}}{m^{1/d}} \qquad (6)$$

The result in (6) says that the expected kNN distances within the spheres increase proportionally as a function of the subsampling rate $m$. This result reflects the intuition that, in relative terms, the contrast between the densities of the spheres is kept constant, which justifies the use of a subsampling procedure with even sampling probabilities. In an ensemble setting, for instance, this means that one can get multiple (sub)samples that exhibit variability (diversity) in terms of their observations, but keep the same expected density profile as the full dataset.

Figure 1: Behaviour of the expected 5-NN distances for two spheres with radius $r = 1$, in a 2D Euclidean space, containing $1000m$ (circles) and $100m$ (triangles) objects uniformly distributed ($m$ is a fraction of the data). [Plot not reproduced; x-axis: fraction of data ($m$), y-axis: expected kNN distances.]

The above result is important but it does not explain all implications of subsampling when using unnormalized nearest neighbor distances. In absolute terms, Equations (4) and (5) tell us that the expected difference in the kNN distances will be greater for a less dense sphere, i.e., $\Delta_1 > \Delta_2$ if $n_1 < n_2$. This means that the expected kNN distances diverge in absolute terms when the data are downsampled to a fraction $m$ of their original size. In other words, the absolute differences between the expected kNN distances in areas of different densities tend to increase as the sample becomes smaller. This effect is illustrated in Figure 1 for $r = 1$, $d = 2$, $k = 5$, $n_1 = 100$, $n_2 = 1000$, and $m$ ranging from 0.1 to 1.
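The numbers behind this illustration can be reproduced with a few lines of arithmetic. The sketch below evaluates the expected kNN distance of Equation (1) for the two spheres, using the parameter values stated for Figure 1; the function and variable names are chosen here for illustration only.

```python
import math

def expected_knn_distance(r, k, n, d):
    """Expected k-NN distance for n uniform points in a d-dim sphere of radius r."""
    return r * (k / n) ** (1.0 / d)

# Parameter values as stated for Figure 1.
r, d, k = 1.0, 2, 5
n1, n2 = 100, 1000  # sparse sphere and dense sphere

results = {}
for m in (1.0, 0.5, 0.2, 0.1):
    full1 = expected_knn_distance(r, k, n1, d)
    full2 = expected_knn_distance(r, k, n2, d)
    # Absolute increase of the expected distance after keeping a fraction m.
    delta1 = expected_knn_distance(r, k, n1 * m, d) - full1
    delta2 = expected_knn_distance(r, k, n2 * m, d) - full2
    results[m] = (delta1, delta2, delta1 / full1, delta2 / full2)
    print(f"m={m:.1f}  delta1={delta1:.4f}  delta2={delta2:.4f}  "
          f"relative1={delta1 / full1:.4f}  relative2={delta2 / full2:.4f}")
```

For every sampling fraction the two relative increases coincide, while the absolute increase is larger in the sparse sphere, which is the widening gap the text describes.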
Such an effect can be beneficial for outlier detection, since it can make it easier to distinguish between outliers and inliers. Particularly when also using an ensemble as discussed above, the gap in the ranks between outliers and inliers can increase, making inversion of ranks between these two groups less likely.

3.3 Method and Complexity

Note that the implementation of our proposal is not as simple as taking subsamples and then running the outlier detection algorithms on these subsamples. This way we would very likely completely miss information on the outlierness of many objects that are not contained in any subsample, and many objects would get scores only from some of the subsamples. Instead, for each ensemble member, we draw a subsample from the database and compute the neighborhood of each object in the database based on the subsample. This way, using subsample-based ensembles can also lead to a considerable speed-up, compared to other types of ensembles and, for small subsamples and ensemble sizes, even compared to running the base method on the whole data set. We will demonstrate in the experimental evaluation that sample sizes small enough to achieve substantial runtime improvements are good choices in practice, leading to good outlier detection rates. In this subsection, we show the expected runtime improvements by studying the theoretical complexities.
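The procedure described above can be sketched as follows. This is a minimal illustration, not the authors' ELKI implementation: it swaps in the plain distance to the k-th nearest neighbour as the base outlier score (the paper's experiments use LOF, LDOF, and LoOP), and the function name, default values, and toy data are assumptions made here.

```python
import numpy as np

def subsample_ensemble_scores(data, k, sample_fraction=0.1, n_members=25, seed=0):
    """Average k-NN-distance outlier scores over several random subsamples.

    Every object in `data` is scored by every ensemble member, but each member
    searches for neighbours only inside its own subsample.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    # Assumption: each subsample must hold at least k + 1 points so that a
    # sampled point still has k neighbours besides itself.
    sample_size = max(k + 1, int(round(sample_fraction * n)))
    total = np.zeros(n)
    for _ in range(n_members):
        idx = rng.choice(n, size=sample_size, replace=False)
        diff = data[:, None, :] - data[idx][None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=2))        # shape (n, sample_size)
        dist[idx, np.arange(sample_size)] = np.inf     # a point is not its own neighbour
        total += np.partition(dist, k - 1, axis=1)[:, k - 1]  # k-th smallest distance
    return total / n_members

# Toy data (assumption): 200 Gaussian inliers plus one distant point.
rng = np.random.default_rng(1)
inliers = rng.standard_normal((200, 2))
data = np.vstack([inliers, [[10.0, 10.0]]])

scores = subsample_ensemble_scores(data, k=3)
print("index of highest score:", int(np.argmax(scores)))
```

Each member costs one distance computation between all n objects and its subsample, so the work per member scales with the sample fraction rather than with the full database size.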

While other ensemble methods require a multiple of the computing time of the base learner, a subsample-based ensemble is theoretically faster (and requires fewer resources) than other types of ensembles. The typical complexity of a base method is $O(n^2)$, due to the required kNN queries over a database of $n$ objects. The runtime of a standard ensemble such as feature bagging is essentially $s$ times the runtime of the base method, where $s$ is a factor that is determined by the number of base learners used in the ensemble (i.e., the size of the ensemble). This factor is reduced in the case of feature bagging: using only a subset of the dimensions makes individual distance computations faster by some constant factor. For sample-based ensembles, on the other hand, the complete ensemble can even be faster than the base method on the complete dataset, because of the quadratic runtime in $n$ of the base method. While the base method requires kNN queries for each object on the complete database (hence $O(n^2)$), using a subsample of size $m \cdot n$, $0 < m < 1$, reduces this to $O(n^2 \cdot m)$. The runtime of a sample-based ensemble is essentially $s$ times the runtime of the base method, using a much smaller data set for the neighborhood computation. For an ensemble size of 10 base learners and a sample size of 10%, the sample-based ensemble would require roughly the same runtime as a single base method on the full dataset but 10 times less time than an ensemble with the same number $s$ of ensemble members based on other means of diversity. For larger ensembles, the ensemble requires only a small multiple of the runtime of the base method but still only 10% (or the equivalent of the sample size $m$) of the runtime of a standard ensemble. For example, if we use 25 ensemble members and sample size 10%, the ensemble will require roughly 2.5 times the runtime of the base method.

4. EVALUATION

4.1 Methods and Parameters

For the reasons discussed in Section 2, the canonical competitor is feature bagging (FB) [30]. As base methods we use LOF [0], LDOF [44], and LoOP [27].
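For the ensemble configurations mentioned in Section 3.3, the runtime model can be checked with a few lines of arithmetic. The sketch below assumes the idealized costs stated there, $O(n^2)$ for the base method and $O(n^2 \cdot m)$ per subsampling member, and ignores constant factors and preprocessing; the function name is hypothetical.

```python
def relative_runtime(ensemble_size, sample_fraction=1.0):
    """Ensemble runtime as a multiple of one base-method run on the full data.

    Assumes base cost ~ n^2 and member cost ~ n^2 * sample_fraction.
    """
    return ensemble_size * sample_fraction

# Configurations discussed in the text.
print(relative_runtime(10, 0.1))   # 10 members, 10% samples
print(relative_runtime(25, 0.1))   # 25 members, 10% samples
print(relative_runtime(25))        # standard ensemble with 25 members on the full data
```

Under this model a subsampling ensemble stays cheaper than a single run of the base method whenever the product of ensemble size and sample fraction is below one.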
For the setup of experiments, we have to consider various parameters. For both ensemble methods (feature bagging and subsampling), we choose a fixed number of 25 ensemble members. We follow the original setup of the feature bagging method, combining the scores of the ensemble members by computing the average. For the subsampling, we consider various sample sizes. Each of the base methods requires a size $k$ of the neighborhood. Hence we will show experimental results (i) with a fixed choice of $k$ and varying sample size; (ii) with a fixed sample size, varying $k$; and (iii) with fixed choices of $k$ and sample size, comparing different base methods. When we fix $k$, we choose a value that gives a reasonable result quality (i.e., better than random) for the base method and compare that to the ensemble variants. Finally (iv), for the synthetic dataset collections, where the individual datasets follow the same general characteristics, we show an average behaviour over all datasets of the collection. We report the area under the receiver operating characteristic curve (ROC AUC), where the ROC curve plots the true positive rate vs. the false positive rate, a common measure for evaluation of outlier detection methods [7, 28, 30, 3, 36]. The experiments are performed using ELKI [2, 3].

4.2 Datasets

For a statistical assessment, we generate two independent sets of 30 synthetic datasets (batch1 and batch2). For each dataset, we randomly choose values for the following parameters in the given range: dimensionality $d \in [20, \ldots, 40]$, number of clusters $c \in [2, \ldots, 10]$, and, for each cluster independently, the number of points $n_{c_i} \in [600, \ldots, 1000]$. For each cluster, the points are generated following a Gaussian model as follows: For each cluster $c_i$ and each attribute $a$, we choose a mean $\mu_{c_i,a}$ from a uniform distribution in $[-10, 10]$ and a standard deviation $\sigma_{c_i,a}$ from a uniform distribution in $[0.1, 1]$. Then for the cluster $c_i$, $n_{c_i}$ cluster objects (points) are generated attribute-wise by the Gaussians $\mathcal{N}(\mu_{c_i,a}, \sigma_{c_i,a})$.
The resulting cluster is rotated by a series of random rotations and the covariance matrix $\Sigma$ corresponding to the theoretical model is computed by the corresponding matrix operations [38]. Then, we compute for each point the Mahalanobis distance to its corresponding cluster center, using the covariance matrix $\Sigma$ of the cluster. For a dataset dimensionality $d$, the (squared) Mahalanobis distances for each cluster follow a $\chi^2$ distribution with $d$ degrees of freedom. We label as outliers those points that exhibit a distance to their cluster center larger than the theoretical 0.975 quantile, independently of the actually occurring Mahalanobis distances of the sample points. This results in an expected amount of 2.5% outliers per dataset.

As real datasets we use the datasets Satimage, Lymphography, and Segment (used also by Lazarevic and Kumar [30]). Additionally, we chose from the UCI machine learning repository [5]: Wisconsin breast cancer (WBC) and Waveform Database Generator (waveform). While Lazarevic and Kumar consider outlier detection as equivalent to rare class detection, we argue that outliers are bound to be rare, but objects of a rare class are not necessarily outliers. Therefore, we use a different preprocessing for some of the datasets: For Satimage, we combined train and test set and transformed the dataset to an outlier task by taking a sample of 10% from class 2, evaluating the downsampled class as outliers vs. the rest (see footnote 2). For Lymphography, we merged the small classes 1 & 4 as outliers vs. the rest. For Segment, we chose classes GRASS, PATH, and SKY for downsampling, in turn, to 10%, which renders the remaining objects of these classes outliers (resulting in three different datasets). For the datasets WBC and waveform we also select a meaningful outlier class for downsampling ("malignant" and "0", respectively). With this method of using classification data for evaluation of outlier detection methods we conform with the literature [, 24, 29, 43, 44]. Overall, this results in 60 synthetic and 7 real data sets.
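The labeling rule for the synthetic clusters can be sketched in a few lines. This is an illustration under stated assumptions, not the generator used for the experiments: a single random orthogonal matrix from a QR decomposition stands in for the "series of random rotations", the cluster size of 5000 is larger than in the paper so that the empirical outlier fraction is stable, the dimensionality is fixed to an even value so that the $\chi^2$ quantile can be computed with the standard library only, and all function names are hypothetical.

```python
import math
import numpy as np

def chi2_cdf_even(x, dof):
    """Closed-form chi-square CDF, valid for an even number of degrees of freedom."""
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= (x / 2.0) / i
        total += term
    return 1.0 - math.exp(-x / 2.0) * total

def chi2_quantile_even(p, dof):
    """Invert the CDF above by bisection."""
    lo, hi = 0.0, 10.0 * dof + 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def make_labeled_cluster(n_points, dim, rng):
    """One rotated Gaussian cluster with outlier labels from the 0.975 quantile."""
    mean = rng.uniform(-10.0, 10.0, size=dim)
    std = rng.uniform(0.1, 1.0, size=dim)
    # Assumption: one random orthogonal matrix replaces the series of rotations.
    rotation, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    points = mean + (rng.standard_normal((n_points, dim)) * std) @ rotation.T
    cov = rotation @ np.diag(std ** 2) @ rotation.T   # theoretical covariance
    diff = points - mean
    sq_mahalanobis = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    is_outlier = sq_mahalanobis > chi2_quantile_even(0.975, dim)
    return points, is_outlier

rng = np.random.default_rng(42)
points, is_outlier = make_labeled_cluster(5000, 20, rng)
outlier_fraction = float(is_outlier.mean())
print("fraction labeled as outliers:", outlier_fraction)
```

Because the threshold comes from the theoretical distribution rather than from the sample, the fraction of labeled outliers fluctuates around 2.5% from cluster to cluster instead of being fixed.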
Footnote 2: Lazarevic and Kumar use the smallest class 4 as outlier vs. rest, but this is an example where the rare class does not constitute outliers, as the classes 3-7 are all very similar. Accordingly, they report performance very close to a random result on this dataset.

4.3 Efficiency

For a fair comparison, we use a preprocessing of the neighborhood computation for all methods on equal terms, as facilitated by the framework ELKI [2]. As we use 25 ensemble members in our experiments, we study the runtime of a typical base method (LOF), the subsampling ensemble (10% sample size), and feature bagging, when scaling the number of objects in the database. As demonstrated in Figure 2,

the subsampling ensemble is close to the base method while feature bagging requires a multiple of the runtime.

Figure 2: Runtime of LOF, subsampling ensemble, and feature bagging when increasing database size. [Plot not reproduced; x-axis: instances in dataset, y-axis: time (s); curves: feature bagging ensemble, subsampling ensemble, base method (LOF).]

Figure 3: Quality with increasing ensemble size. [Plot not reproduced; x-axis: no. ensemble members.]

As discussed in Section 3.3, the efficiency depends on the sample size and on the ensemble size. We do not evaluate the ensemble size further; let us just consider an example on one of the synthetic datasets to study the behaviour when adding more ensemble members (Figure 3). We see a strong increase in quality between 2 and 10 ensemble members; then, up to 25 ensemble members, the quality increases further, steadily but slowly. This improved performance comes at moderate runtime cost. Nevertheless, we fix the ensemble size to 25 in the following experiments.

4.4 Effectiveness

For illustration of results with variances we use box plots where the box extends from the lower to upper quartile values of the data, with a line at the median. The whiskers extend from the box to show the range of the data. The whiskers extend to the most extreme data point within 1.5*(75%-25%) of the data range. Occasionally occurring single data points beyond that range are plotted as flier points past the end of the whiskers. Note however that the source of variance in the plots will differ: in synthetic data, we give the distribution over the 30 datasets; in real data, we give the distribution over the individual ensemble members.

Synthetic Data. First, we show as a statistical assessment the results of the subsample-based ensemble over all the synthetic datasets of batch1. Here the box plots visualize the distribution of the results for the same sample size, the same base method, and the same parametrization of the base method for all datasets in the batch, for the subsampling ensemble, the base method (sample size 1), and the feature bagging ensemble (FB).
Figure 4 shows examples for a fixed k = 3 for the base methods LDOF, LOF, and LoOP. The behaviour on batch2 (not shown) follows the same general pattern. We varied k from 2 to 10 and got similar results. The smaller sample size leads to larger improvements.

Figure 4: ROC AUC for ensembles of different sample sizes as well as feature bagging (FB) and base method (sample size = 1), on the 30 datasets of batch1. Panels: (a) LDOF, (b) LOF, (c) LoOP, k = 3.

Real Data. Having shown the ensemble performance over a set of 30 datasets for the synthetic data, we now analyze the behaviour on individual real datasets. Here, the box plots show the variance in the ROC AUC achieved by the individual ensemble members based on subsamples of different sample sizes (with zero variance for sample size 1, which reflects the performance of the deterministic base method on the complete data) and by the members of feature bagging (FB). The ROC AUC of each ensemble (subsampling and feature bagging) is visualized by a diamond. Figures 5, 6, and 7 show the results for the three base methods on the datasets Lymphography, WBC, and Satimage-2, respectively. We choose the same k for all base methods such that at least some of the base methods get reasonable results. For the larger dataset Satimage-2, k needs to be larger as well. Comparing these plots, we see different behaviour of the base methods, as some datasets are easy for some base methods while other datasets are relatively hard. In particular, LDOF does not retrieve sensible results on all three datasets. In all cases, however, the subsampling ensemble improves. Feature bagging does not always perform that convincingly; in some cases it drops to (or below) random quality. Only for LDOF and LoOP on Lymphography (Figures 5(a), 5(c)) can feature bagging recover from the weak performance of the base learner.

Figure 5: ROC AUC for ensemble members of the subsampling ensemble for different sample sizes (boxes), the base method (sample size = 1), and ensembles (diamonds) on top of subsamples and feature bags (FB) on dataset Lymphography. Panels: (a) LDOF, (b) LOF, (c) LoOP.

Figure 6: ROC AUC for ensemble members of the subsampling ensemble for different sample sizes (boxes), the base method (sample size = 1), and ensembles (diamonds) on top of subsamples and feature bags (FB) on dataset WBC. Panels: (a) LDOF, (b) LOF, (c) LoOP, k = 2.

As a general picture from these and other results, we see that the smaller sample size actually has the larger potential for improvement. Although a smaller sample keeps less information about the dataset (and the unknown underlying density distribution), from the point of view of ensemble learning these findings make sense, as the smaller samples will actually provide the most diverse ensemble members. They also show the practical applicability of the reasoning we provide in Section 3.2. In most cases, we find the 10% sample to work best. However, the break-even point between too much loss of information and too high similarity of ensemble members differs from dataset to dataset. We also have examples where the 10% sample is already too small, such as in Figure 5(a). That is possibly related to the fact that the Lymphography data are relatively small. However, we fix the sample size to 0.1 for the following experiments and explore the behaviour of base method, subsampling ensemble, and feature bagging ensemble over a range of k. We see, as an example, in Figure 8, a slight but steady increase of the ROC AUC with k for the base methods and the subsampling ensemble, while the feature bagging ensemble appears to be much more unstable.
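The procedure evaluated in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ELKI implementation used in the experiments: a plain kNN-distance score stands in for LDOF, LOF, or LoOP; every object is scored against each subsample; member scores are combined by averaging; and the names `knn_distance`, `subsampling_ensemble`, and `roc_auc` are hypothetical.

```python
import math
import random


def knn_distance(i, sample_idx, data, k):
    """Distance from object i to its k-th nearest neighbour within the sample."""
    dists = sorted(math.dist(data[i], data[j]) for j in sample_idx if j != i)
    return dists[k - 1]


def subsampling_ensemble(data, k=3, sample_fraction=0.1, n_members=25, seed=0):
    """Average the kNN-distance scores obtained on n_members random subsamples."""
    rng = random.Random(seed)
    n = len(data)
    # Keep at least k + 1 objects so each object has k neighbours besides itself.
    size = min(n, max(k + 1, round(sample_fraction * n)))
    totals = [0.0] * n
    for _ in range(n_members):
        sample_idx = rng.sample(range(n), size)  # drawn without replacement
        for i in range(n):
            # Assumption: all objects are scored, neighbours come from the sample.
            totals[i] += knn_distance(i, sample_idx, data, k)
    return [t / n_members for t in totals]


def roc_auc(scores, labels):
    """Fraction of (outlier, inlier) pairs ranked correctly; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): one Gaussian cluster plus five far points.
toy_rng = random.Random(1)
toy_data = [(toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)) for _ in range(200)]
toy_data += [(10, 10), (-10, 10), (10, -10), (-10, -10), (0, 14)]
toy_labels = [0] * 200 + [1] * 5
toy_scores = subsampling_ensemble(toy_data, k=3, sample_fraction=0.1, n_members=25)
toy_auc = roc_auc(toy_scores, toy_labels)
```

The defaults mirror the settings fixed in the text (sample size 0.1, 25 ensemble members). Each member only searches neighbours among the sampled objects, which is where the diversity between members comes from in this sketch.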
While increasing k does not, in general, increase the quality of the results, we observe the same pattern of stability of the base method and the subsampling ensemble and higher variance of the feature bagging ensemble on other datasets as well. For the three datasets based on segment, for k = 20 (again a selection that gives reasonable results for most of the base methods), we show results for all three base methods in Figure 9. Again, the subsampling ensemble compares favourably against the base method as well as against feature bagging.

Figure 7: ROC AUC for ensemble members of the subsampling ensemble for different sample sizes (boxes), the base method (sample size = 1), and ensembles (diamonds) on top of subsamples and feature bags (FB) on dataset Satimage-2. Panels: (a) LDOF, (b) LOF, (c) LoOP, k = 50.

Figure 8: ROC AUC for base methods and corresponding ensembles when varying k on dataset waveform. Panels: (a) LDOF, m = 0.1; (b) LOF, m = 0.1; (c) LoOP, m = 0.1. Series: subsampling ensemble, base method, feature bagging ensemble.

Figure 9: ROC AUC for all methods, k = 20, on different datasets (variants of segment: segment-sky, segment-path, segment-grass). Series: Base, Subsampling, FB.

5. CONCLUSION

Although we compare the sample-based ensemble against feature bagging [30], let us finally note that these two approaches are not strictly competitors. Feature bagging is likely to be an interesting approach in the context of very high-dimensional data [45]. Sampling should be helpful when the datasets are growing too large. On the other hand, feature bagging is not meaningful for low-dimensional data, as the ensemble members are bound to be too similar, and sampling on too small data is probably not too promising. However, these two problems (too small datasets with only a few dimensions) are not really problems of today's research. It might be an interesting question for future work to investigate the integration of both techniques, building ensembles on subsets of features and subsets of data objects simultaneously.

Acknowledgments

This work has been partially supported by NSERC (Canada), FAPESP (Brazil), and CNPq (Brazil).

6. REFERENCES

[1] N. Abe, B. Zadrozny, and J. Langford. Outlier detection by active learning. In Proc. KDD.
[2] E. Achtert, S. Goldhofer, H.-P. Kriegel, E. Schubert, and A. Zimek. Evaluation of clusterings - metrics and visual support. In Proc. ICDE, 2012.
[3] E. Achtert, H.-P. Kriegel, E. Schubert, and A. Zimek. Interactive data mining with 3D-parallel-coordinate-trees. In Proc. SIGMOD, 2013.
[4] F. Angiulli and F. Fassetti. DOLPHIN: an efficient algorithm for mining distance-based outliers in very large datasets. ACM TKDD.
[5] F. Angiulli and C. Pizzuti. Fast outlier detection in high dimensional spaces. In Proc. PKDD, 2002.

[6] V. Barnett and T. Lewis. Outliers in Statistical Data. John Wiley & Sons, 3rd edition, 1994.
[7] S. D. Bay and M. Schwabacher. Mining distance-based outliers in near linear time with randomization and a simple pruning rule. In Proc. KDD.
[8] A. Bertoni and G. Valentini. Ensembles based on random projections to improve the accuracy of clustering algorithms. In WIRN / NAIS.
[9] M. M. Breunig, H.-P. Kriegel, P. Kröger, and J. Sander. Data Bubbles: Quality preserving performance boosting for hierarchical clustering. In Proc. SIGMOD, 2001.
[10] M. M. Breunig, H.-P. Kriegel, R. Ng, and J. Sander. LOF: Identifying density-based local outliers. In Proc. SIGMOD.
[11] G. Brown, J. Wyatt, R. Harris, and X. Yao. Diversity creation methods: a survey and categorisation. Information Fusion.
[12] T. G. Dietterich. Ensemble methods in machine learning. In Proc. MCS.
[13] S. Dudoit and J. Fridlyand. Bagging to improve the accuracy of a clustering procedure. Bioinformatics.
[14] X. Z. Fern and C. E. Brodley. Random projection for high dimensional data clustering: A cluster ensemble approach. In Proc. ICML.
[15] A. Frank and A. Asuncion. UCI machine learning repository.
[16] A. L. N. Fred and A. K. Jain. Robust data clustering. In Proc. CVPR.
[17] J. Gao and P.-N. Tan. Converting output scores from outlier detection algorithms into probability estimates. In Proc. ICDM.
[18] J. Ghosh and A. Acharya. Cluster ensembles. WIREs DMKD, 2011.
[19] A. S. Hadi, A. H. M. Rahmatullah Imon, and M. Werner. Detection of outliers. WIREs Comp. Stat.
[20] S. T. Hadjitodorov, L. I. Kuncheva, and L. P. Todorova. Moderate diversity for better cluster ensembles. Information Fusion.
[21] L. K. Hansen and P. Salamon. Neural network ensembles. IEEE TPAMI, 1990.
[22] W. Jin, A. Tung, and J. Han. Mining top-n local outliers in large databases. In Proc. KDD, 2001.
[23] W. Jin, A. K. H. Tung, J. Han, and W. Wang. Ranking outliers using symmetric neighborhood relationship. In Proc. PAKDD.
[24] F. Keller, E. Müller, and K. Böhm. HiCS: high contrast subspaces for density-based outlier ranking. In Proc. ICDE, 2012.
[25] E. M. Knorr and R. T. Ng. A unified notion of outliers: Properties and computation. In Proc. KDD, 1997.
[26] G. Kollios, D. Gunopulos, N. Koudas, and S. Berchthold. Efficient biased sampling for approximate clustering and outlier detection in large datasets. IEEE TKDE.
[27] H.-P. Kriegel, P. Kröger, E. Schubert, and A. Zimek. LoOP: local outlier probabilities. In Proc. CIKM.
[28] H.-P. Kriegel, P. Kröger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, 2011.
[29] H.-P. Kriegel, M. Schubert, and A. Zimek. Angle-based outlier detection in high-dimensional data. In Proc. KDD.
[30] A. Lazarevic and V. Kumar. Feature bagging for outlier detection. In Proc. KDD.
[31] H. V. Nguyen, H. H. Ang, and V. Gopalkrishnan. Mining outliers with ensemble of heterogeneous detectors on random subspaces. In Proc. DASFAA, 2010.
[32] G. H. Orair, C. Teixeira, Y. Wang, W. Meira Jr., and S. Parthasarathy. Distance-based outlier detection: Consolidation and renewed bearing. PVLDB, 2010.
[33] S. Papadimitriou, H. Kitagawa, P. Gibbons, and C. Faloutsos. LOCI: Fast outlier detection using the local correlation integral. In Proc. ICDE.
[34] S. Ramaswamy, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. In Proc. SIGMOD.
[35] P. J. Rousseeuw and M. Hubert. Robust statistics for outlier detection. WIREs DMKD, 2011.
[36] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, 2012.
[37] E. Schubert, A. Zimek, and H.-P. Kriegel. Local outlier detection reconsidered: a generalized view on locality with applications to spatial, video, and network outlier detection. Data Min. Knowl. Disc., 2012.
[38] T. Soler and M. Chin. On transformation of covariance matrices between local Cartesian coordinate systems and commutative diagrams. In ASP-ACSM Convention, 1985.
[39] A. Strehl and J. Ghosh. Cluster ensembles - a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res.
[40] A. Topchy, A. Jain, and W. Punch. Clustering ensembles: Models of consensus and weak partitions. IEEE TPAMI.
[41] G. Valentini and F. Masulli. Ensembles of learning machines. In Proc. Neural Nets WIRN.
[42] N. H. Vu and V. Gopalkrishnan. Efficient pruning schemes for distance-based outlier detection. In Proc. ECML PKDD.
[43] J. Yang, N. Zhong, Y. Yao, and J. Wang. Local peculiarity factor and its application in outlier detection. In Proc. KDD.
[44] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD.
[45] A. Zimek, E. Schubert, and H.-P. Kriegel. A survey on unsupervised outlier detection in high-dimensional numerical data. Stat. Anal. Data Min., 2012.


More information

Improved Accuracy of Component Positioning with Robotic-Assisted Unicompartmental Knee Arthroplasty

Improved Accuracy of Component Positioning with Robotic-Assisted Unicompartmental Knee Arthroplasty 627 COPYRIGHT Ó 2016 BY THE JOURNAL OF BONE AND JOINT SURGERY, INCORPORATED Improve Accuracy of Component Positioning with Robotic-Assiste Unicompartmental Knee Arthroplasty Data from a Prospective, Ranomize

More information

Gary L. Grove, PhD, and Chou I. Eyberg, MS. Investigation performed at cyberderm Clinical Studies, Broomall, Pennsylvania

Gary L. Grove, PhD, and Chou I. Eyberg, MS. Investigation performed at cyberderm Clinical Studies, Broomall, Pennsylvania 1187 COPYRIGHT Ó 2012 BY THE OURNAL OF BONE AND OINT SURGERY, INCORPORATED Comparison of Two Preoperative Skin Antiseptic Preparations an Resultant Surgical Incise Drape Ahesion to Skin in Healthy Volunteers

More information

A Clinical Decision Support Tool for Familial Hypercholesterolemia Based on Physician Input

A Clinical Decision Support Tool for Familial Hypercholesterolemia Based on Physician Input ORIGINAL ARTICLE A Clinical Decision Support Tool for Familial Hypercholesterolemia Base on Physician Input Ali A. Hasnie, MD; Ashok Kumbamu, PhD; Maya S. Safarova, MD, PhD; Pero J. Caraballo, MD; an Iftikhar

More information

Statistical Consideration for Bilateral Cases in Orthopaedic Research

Statistical Consideration for Bilateral Cases in Orthopaedic Research 1732 COPYRIGHT Ó 2010 BY THE JOURNAL OF BONE AND JOINT SURGERY, INCORPORATED Statistical Consieration for Bilateral Cases in Orthopaeic Research By Moon Seok Park, MD, Sung Ju Kim, MS, Chin Youb Chung,

More information

UC Berkeley UC Berkeley Previously Published Works

UC Berkeley UC Berkeley Previously Published Works UC Berkeley UC Berkeley Previously Publishe Works Title Variability in Costs Associate with Total Hip an Knee Replacement Implants Permalink https://escholarship.org/uc/item/67z1b71r Journal The Journal

More information

A Prospective Randomized Study of Minimally Invasive Total Knee Arthroplasty Compared with Conventional Surgery

A Prospective Randomized Study of Minimally Invasive Total Knee Arthroplasty Compared with Conventional Surgery This is an enhance PDF from The Journal of Bone an Joint Surgery The PDF of the article you requeste follows this cover page. A Prospective Ranomize Stuy of Total Knee Arthroplasty Compare with Conventional

More information

Legg-Calvé-Perthes Disease: A Review of Cases with Onset Before Six Years of Age

Legg-Calvé-Perthes Disease: A Review of Cases with Onset Before Six Years of Age This is an enhance PF from The Journal of Bone an Joint Surgery The PF of the article you requeste follows this cover page. Legg-Calvé-Perthes isease: A Review of Cases with Onset Before Six Years of Age

More information

Trend Toward High-Volume Hospitals and the Influence on Complications in Knee and Hip Arthroplasty

Trend Toward High-Volume Hospitals and the Influence on Complications in Knee and Hip Arthroplasty 707 COPYRIGHT Ó 2016 BY THE JOURNAL OF BONE AND JOINT SURGERY, INCORPORATED A commentary by Davi W. Manning, MD, is linke to the online version of this article at jbjs.org. Tren Towar High-Volume Hospitals

More information

Host-vector interaction in dengue: a simple mathematical model

Host-vector interaction in dengue: a simple mathematical model Host-vector interaction in engue: a simple mathematical moel K Tennakone, L Ajith De Silva (Inex wors: engue, engue moel, engue Sri Lanka, enemic equilibrium, engue virus iversity) Abstract Introuction

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 Exam policy: This exam allows one one-page, two-sided cheat sheet; No other materials. Time: 80 minutes. Be sure to write your name and

More information

Recurrent Neural Networks for Multivariate Time Series with Missing Values

Recurrent Neural Networks for Multivariate Time Series with Missing Values www.nature.com/scientificreports Receive: 1 November 2017 Accepte: 26 March 2018 Publishe: xx xx xxxx OPEN Recurrent Neural Networks for Multivariate Time Series with Missing Values Zhengping Che 1, Sanjay

More information

Analysis and Simulations of Dynamic Models of Hepatitis B Virus

Analysis and Simulations of Dynamic Models of Hepatitis B Virus Analysis an Simulations of Dynamic Moels of Hepatitis B Virus Xisong Dong (Corresponing author) National Engineering Laboratory for Disaster Backup an Recovery Beijing University of Posts an Telecommunications

More information

Analyzing the impact of modeling choices and assumptions in compartmental epidemiological models

Analyzing the impact of modeling choices and assumptions in compartmental epidemiological models Simulation Special Section on Meical Simulation Analyzing the impact of moeling choices an assumptions in compartmental epiemiological moels Simulation: Transactions of the Society for Moeling an Simulation

More information

Analyzing the Impact of Modeling Choices and Assumptions in Compartmental Epidemiological Models

Analyzing the Impact of Modeling Choices and Assumptions in Compartmental Epidemiological Models Analyzing the Impact of Moeling Choices an Assumptions in Compartmental Epiemiological Moels Journal Title XX(X):1 11 c The Author(s) 2016 Reprints an permission: sagepub.co.uk/journalspermissions.nav

More information

Computer-Assisted Surgical Navigation Does Not Improve the Alignment and Orientation of the Components in Total Knee Arthroplasty

Computer-Assisted Surgical Navigation Does Not Improve the Alignment and Orientation of the Components in Total Knee Arthroplasty This is an enhance PDF from The Journal of Bone an Joint Surgery The PDF of the article you requeste follows this cover page. Computer-Assiste Surgical Navigation Does Not Improve the Alignment an Orientation

More information

The incidence of treated end-stage renal disease in New Zealand Maori and Pacific Island people and in Indigenous Australians

The incidence of treated end-stage renal disease in New Zealand Maori and Pacific Island people and in Indigenous Australians Nephrol Dial Transplant (2004) 19: 678 685 DOI: 10.1093/nt/gfg592 Original Article The incience of treate en-stage renal isease in New Zealan Maori an Pacific Islan people an in Inigenous Australians John

More information

Three-Dimensional Analysis of Acute Scaphoid Fracture Displacement: Proximal Extension Deformity of the Scaphoid

Three-Dimensional Analysis of Acute Scaphoid Fracture Displacement: Proximal Extension Deformity of the Scaphoid 141 COPYRIGHT Ó 2017 BY THE JOURNAL OF BONE AND JOINT SURGERY, INCORPORATED Three-Dimensional Analysis of Acute Scaphoi Fracture Displacement: Proximal Extension Deformity of the Scaphoi Yonatan Schwarcz,

More information

Experimental Study on Strength Evaluation Applied for Teeth Extraction: An In Vivo Study

Experimental Study on Strength Evaluation Applied for Teeth Extraction: An In Vivo Study Sen Orers of Reprints at reprints@benthamscience.net 2 The Open Dentistry Journal, 213, 7, 2-26 Open Access Experimental Stuy on Strength Evaluation Applie for Teeth Extraction: An In Vivo Stuy Marco Cicciù

More information

Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge

Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge Bioinformatics, 34, 2018, i395 i403 oi: 10.1093/bioinformatics/bty257 ISMB 2018 Improving genomics-base preictions for precision meicine through active elicitation of expert knowlege Iiris Sunin 1,, Tomi

More information

As information technologies and applications

As information technologies and applications COMPUTING PRACTICES Using Coplink to Analyze Criminal-Justice Data The Coplink system applies a concept space a statistics-base, algorithmic technique that ientifies relationships between suspects, victims,

More information

A Vital Sign and Sleep Monitoring Using Millimeter Wave

A Vital Sign and Sleep Monitoring Using Millimeter Wave A Vital Sign an Sleep Monitoring Using Millimeter Wave ZHICHENG YANG, University of California, Davis PARTH H. PATHAK, George Mason University YUNZE ZENG, University of California, Davis XIXI LIRAN, University

More information

CAN Tree Routing for Content-Addressable Network

CAN Tree Routing for Content-Addressable Network Sensors & Transucers 2014 by IFSA Publishing, S. L. htt://www.sensorsortal.com CAN Tree Routing for Content-Aressable Network Zhongtao LI, Torben WEIS University Duisburg-Essen, Universität Duisburg-Essen

More information

Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge

Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge https://hela.helsinki.fi Improving genomics-base preictions for precision meicine through active elicitation of expert knowlege Sunin, Iiris 2018-07-01 Sunin, I, Peltola, T, Micallef, L, Afrabanpey, H,

More information

Distal extension of the direct anterior approach to the hip poses risk to neurovascular structures: an anatomical study

Distal extension of the direct anterior approach to the hip poses risk to neurovascular structures: an anatomical study Zurich Open Repository an Archive University of Zurich Main Library Strickhofstrasse 39 CH-8057 Zurich www.zora.uzh.ch Year: 2015 Distal extension of the irect anterior approach to the hip poses risk to

More information

Asymmetric lateral distribution of melanoma and Merkel cell carcinoma in the United States

Asymmetric lateral distribution of melanoma and Merkel cell carcinoma in the United States Asymmetric lateral istribution of melanoma an Merkel cell carcinoma in the Unite States KellyG.Paulson,PhD,JayasriG.Iyer,MD,anPaulNghiem,MD,PhD Seattle, Washington Backgroun: A recent report suggeste a

More information

An Improved Algorithm To Predict Recurrence Of Breast Cancer

An Improved Algorithm To Predict Recurrence Of Breast Cancer An Improved Algorithm To Predict Recurrence Of Breast Cancer Umang Agrawal 1, Ass. Prof. Ishan K Rajani 2 1 M.E Computer Engineer, Silver Oak College of Engineering & Technology, Gujarat, India. 2 Assistant

More information

A Propensity-Matched Cohort Study

A Propensity-Matched Cohort Study 380 COPYRIGHT Ó 2014 BY THE JOURNAL OF BONE AND JOINT SURGERY, INCORPORATED Delaye Woun Closure Increases Deep-Infection Rate Associate with Lower-Grae Open Fractures A Propensity-Matche Cohort Stuy Richar

More information

A reduced ODE model of the bovine estrous cycle

A reduced ODE model of the bovine estrous cycle Konra-Zuse-Zentrum für Informationstechnik Berlin Takustraße 7 D-14195 Berlin-Dahlem Germany C. STÖTZEL, M. APRI, S. RÖBLITZ A reuce ODE moel of the bovine estrous cycle ZIB Report 14-33 (August 2014)

More information

Variable Features Selection for Classification of Medical Data using SVM

Variable Features Selection for Classification of Medical Data using SVM Variable Features Selection for Classification of Medical Data using SVM Monika Lamba USICT, GGSIPU, Delhi, India ABSTRACT: The parameters selection in support vector machines (SVM), with regards to accuracy

More information

Downloaded from:

Downloaded from: Eames, KTD (2007) Contact tracing strategies in heterogeneous populations. Epiemiology an infection, 135 (3). pp. 443-454. ISSN 0950-2688 DOI: https://oi.org/10.1017/s0950268806006923 Downloae from: http://researchonline.lshtm.ac.uk/6930/

More information

AMERICAN THORACIC SOCIETY DOCUMENTS

AMERICAN THORACIC SOCIETY DOCUMENTS AMERICAN THORACIC SOCIETY DOCUMENTS An Official American Thoracic Society Research Statement: Current Challenges Facing Research an Therapeutic Avances in Airway Remoeling Y. S. Prakash, Anrew J. Halayko,

More information

By Edmund Lau, MS, Kevin Ong, PhD, Steven Kurtz, PhD, Jordana Schmier, MA, and Av Edidin, PhD

By Edmund Lau, MS, Kevin Ong, PhD, Steven Kurtz, PhD, Jordana Schmier, MA, and Av Edidin, PhD 1479 COPYRIGHT Ó 2008 BY THE JOURNAL OF BONE AND JOINT SURGERY, INCORPORATED Mortality Following the Diagnosis of a Vertebral Compression Fracture in the Meicare Population By Emun Lau, MS, Kevin Ong,

More information

By Jae Kwang Kim, MD, PhD, Young-Do Koh, MD, PhD, and Nam-Hoon Do, MD

By Jae Kwang Kim, MD, PhD, Young-Do Koh, MD, PhD, and Nam-Hoon Do, MD 1 COPYRIGHT Ó 2010 BY THE JOURNAL OF BONE AND JOINT SURGERY, INCORPORATED A commentary by Moheb S. Moneim, MD, is available at www.jbjs.org/commentary an as supplemental material to the online version

More information

How to Design a Good Case Series

How to Design a Good Case Series 21 COPYRIGHT Ó 2009 BY THE JOURNAL OF BONE AND JOINT SURGERY, INCORPORATED How to Design a Goo Case Series By Bauke Kooistra, BSc, Bernaette Dijkman, BSc, Thomas A. Einhorn, MD, an Mohit Bhanari, MD, MSc,

More information

Background. Aim. Design and setting. Method. Results. Conclusion. Keywords

Background. Aim. Design and setting. Method. Results. Conclusion. Keywords Research Ebun A Abarshi, Michael A Echtel, Lieve Van en Block, Gé A Donker, Luc Deliens an Bregje D Onwuteaka-Philipsen Recognising patients who will ie in the near future: a nationwie stuy via the Dutch

More information

A scored AUC Metric for Classifier Evaluation and Selection

A scored AUC Metric for Classifier Evaluation and Selection A scored AUC Metric for Classifier Evaluation and Selection Shaomin Wu SHAOMIN.WU@READING.AC.UK School of Construction Management and Engineering, The University of Reading, Reading RG6 6AW, UK Peter Flach

More information

An Empirical and Formal Analysis of Decision Trees for Ranking

An Empirical and Formal Analysis of Decision Trees for Ranking An Empirical and Formal Analysis of Decision Trees for Ranking Eyke Hüllermeier Department of Mathematics and Computer Science Marburg University 35032 Marburg, Germany eyke@mathematik.uni-marburg.de Stijn

More information

Corticosteroid injection in diabetic patients with trigger finger: A prospective, randomized, controlled double-blinded study

Corticosteroid injection in diabetic patients with trigger finger: A prospective, randomized, controlled double-blinded study Washington University School of Meicine igital Commons@Becker Open Access Publications 12-1-2007 Corticosteroi injection in iabetic patients with trigger finger: A prospective, ranomize, controlleouble-bline

More information