arXiv: v1 [cs.LG] 28 Nov 2017


Snorkel: Rapid Training Data Creation with Weak Supervision

Alexander Ratner, Stephen H. Bach, Henry Ehrenberg, Jason Fries, Sen Wu, Christopher Ré
Stanford University, Stanford, CA, USA
{ajratner, bach, henryre, jfries, senwu,

ABSTRACT
Labeling training data is increasingly the largest bottleneck in deploying machine learning systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the-art models without hand labeling any training data. Instead, users write labeling functions that express arbitrary heuristics, which can have unknown accuracies and correlations. Snorkel denoises their outputs without access to ground truth by incorporating the first end-to-end implementation of our recently proposed machine learning paradigm, data programming. We present a flexible interface layer for writing labeling functions based on our experience over the past year collaborating with companies, agencies, and research labs. In a user study, subject matter experts build models 2.8× faster and increase predictive performance an average 45.5% versus seven hours of hand labeling. We study the modeling tradeoffs in this new setting and propose an optimizer for automating tradeoff decisions that gives up to 1.8× speedup per pipeline execution. In two collaborations, with the U.S. Department of Veterans Affairs and the U.S. Food and Drug Administration, and on four open-source text and image data sets representative of other deployments, Snorkel provides 132% average improvements to predictive performance over prior heuristic approaches and comes within an average 3.60% of the predictive performance of large hand-curated training sets.

PVLDB Reference Format:
A. Ratner, S. H. Bach, H. Ehrenberg, J. Fries, S. Wu, C. Ré. Snorkel: Rapid Training Data Creation with Weak Supervision. PVLDB, 11 (3): xxxx-yyyy.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Articles from this volume were invited to present their results at The 44th International Conference on Very Large Data Bases, August 2018, Rio de Janeiro, Brazil. Proceedings of the VLDB Endowment, Vol. 11, No. 3. Copyright 2017 VLDB Endowment.

1. INTRODUCTION

Figure 1: [Diagram: Label Source 1 (accuracy 90%, 1k labels) and Label Source 2 (accuracy 60%, 100k labels) applied to unlabeled data.] In Example 1.1, training data is labeled by sources of differing accuracy and coverage. Two key challenges arise in using this weak supervision effectively. First, we need a way to estimate the unknown source accuracies to resolve disagreements. Second, we need to pass on this critical lineage information to the end model being trained.

In the last several years, there has been an explosion of interest in machine-learning-based systems across industry, government, and academia, with an estimated spend this year of $12.5 billion [1]. A central driver has been the advent of deep learning techniques, which can learn task-specific representations of input data, obviating what used to be the most time-consuming development task: feature engineering. These learned representations are particularly effective for tasks like natural language processing and image analysis, which have high-dimensional, high-variance input that is impossible to fully capture with simple rules or hand-engineered features [14, 17].
However, deep learning has a major upfront cost: these methods need massive training sets of labeled examples to learn from (often tens of thousands to millions) to reach peak predictive performance [47]. Such training sets are enormously expensive to create, especially when domain expertise is required. For example, reading scientific papers, analyzing intelligence data, and interpreting medical images all require labeling by trained subject matter experts (SMEs). Moreover, we observe from our engagements with collaborators like research labs and major technology companies that modeling goals such as class definitions or granularity change as projects progress, necessitating re-labeling. Some big companies are able to absorb this cost, hiring large teams to label training data [12, 16, 31]. However, the bulk of practitioners are increasingly turning to weak supervision: cheaper sources of labels that are noisier or heuristic. The most popular form is distant supervision, in which the records of an external knowledge base are heuristically aligned with data points to produce noisy labels [4, 7, 32]. Other forms include crowdsourced labels [37, 50], rules and heuristics for labeling data [39, 52], and others [29, 30, 30, 46, 51]. While these sources are inexpensive, they often have limited accuracy and coverage.

Ideally, we would combine the labels from many weak supervision sources to increase the accuracy and coverage of our training set. However, two key challenges arise in doing so effectively. First, sources will overlap and conflict, and to resolve their conflicts we need to estimate their accuracies and correlation structure, without access to ground truth. Second, we need to pass on critical lineage information about label quality to the end model being trained.

Example 1.1. In Figure 1, we obtain labels from a high accuracy, low coverage Source 1, and from a low accuracy, high coverage Source 2, which overlap and disagree (split-color points). If we take an unweighted majority vote to resolve conflicts, we end up with null (tie-vote) labels. If we could correctly estimate the source accuracies, we would resolve conflicts in the direction of Source 1. We would still need to pass this information on to the end model being trained. Suppose that we took labels from Source 1 where available, and otherwise took labels from Source 2. Then, the expected training set accuracy would be 60.3%, only marginally better than the weaker source. Instead we should represent training label lineage in end model training, weighting labels generated by high-accuracy sources more.

In recent work, we developed data programming as a paradigm for addressing both of these challenges by modeling multiple label sources without access to ground truth, and generating probabilistic training labels representing the lineage of the individual labels. We prove that, surprisingly, we can recover source accuracy and correlation structure without hand-labeled training data [5, 38]. However, there are many practical aspects of implementing and applying this abstraction that have not been previously considered. We present Snorkel, the first end-to-end system for combining weak supervision sources to rapidly create training data.
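The 60.3% figure in Example 1.1 can be checked with a few lines of arithmetic. The sketch below assumes, reading Figure 1, that Source 2 labels all 100k data points and Source 1 labels 1k of them; the function name and its parameters are illustrative and not part of Snorkel.

```python
def expected_accuracy_source1_first(n_total, n_source1, acc1, acc2):
    """Expected training-set accuracy when Source 1's label is used wherever
    it exists and Source 2's label is used for every other data point."""
    n_source2_only = n_total - n_source1
    return (n_source1 * acc1 + n_source2_only * acc2) / n_total


# Assumed reading of Figure 1: 1k points covered by the 90%-accurate source,
# the remaining 99k covered only by the 60%-accurate source.
acc = expected_accuracy_source1_first(
    n_total=100_000, n_source1=1_000, acc1=0.90, acc2=0.60
)
print(f"{acc:.1%}")  # 60.3%
```

Because the high-accuracy source covers only 1% of the data under this reading, it moves the average by just 0.3 points, which is why the example argues for carrying label lineage into end-model training rather than collapsing it into hard labels.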
We built Snorkel as a prototype to study how people could use data programming, a fundamentally new approach to building machine learning applications. Through weekly hackathons and office hours held at Stanford University over the past year, we have interacted with a growing user community around Snorkel's open source implementation. We have observed SMEs in industry, science, and government deploying Snorkel for knowledge base construction, image analysis, bioinformatics, fraud detection, and more. From this experience, we have distilled three principles that have shaped Snorkel's design:

1. Bring All Sources to Bear: The system should enable users to opportunistically use labels from all available weak supervision sources.
2. Training Data as the Interface to ML: The system should model label sources to produce a single, probabilistic label for each data point and train any of a wide range of classifiers to generalize beyond those sources.
3. Supervision as Interactive Programming: The system should provide rapid results in response to user supervision. We envision weak supervision as the REPL-like interface for machine learning.

Our work makes the following technical contributions:

A Flexible Interface for Sources: We observe that the heterogeneity of weak supervision strategies is a stumbling block for developers. Different types of weak supervision operate on different scopes of the input data. For example, distant supervision has to be mapped programmatically to specific spans of text. Crowd workers and weak classifiers often operate over entire documents or images. Heuristic rules are open ended; they can leverage information from multiple contexts simultaneously, such as combining information from a document's title, named entities in the text, and knowledge bases. This heterogeneity was cumbersome enough to completely block users of early versions of Snorkel. To address this challenge, we built an interface layer around the abstract concept of a labeling function (LF).
We developed a flexible language for expressing weak supervision strategies and supporting data structures. We observed accelerated user productivity with these tools, which we validated in a user study where SMEs build models 2.8× faster and increase predictive performance an average 45.5% versus seven hours of hand labeling.

Tradeoffs in Modeling of Sources: Snorkel learns the accuracies of weak supervision sources without access to ground truth using a generative model [38]. Furthermore, it also learns correlations and other statistical dependencies among sources, correcting for dependencies in labeling functions that skew the estimated accuracies [5]. This paradigm gives rise to previously unexplored tradeoff spaces between predictive performance and speed. The natural first question is: when does modeling the accuracies of sources improve predictive performance? Further, how many dependencies, such as correlations, are worth modeling? We study the tradeoffs between predictive performance and training time in generative models for weak supervision. While modeling source accuracies and correlations will not hurt predictive performance, we present a theoretical analysis of when a simple majority vote will work just as well. Based on our conclusions, we introduce an optimizer for deciding when to model accuracies of labeling functions, and when learning can be skipped in favor of a simple majority vote. Further, our optimizer automatically decides which correlations to model among labeling functions. This optimizer correctly predicts the advantage of generative modeling over majority vote to within 2.16 accuracy points on average on our evaluation tasks, and accelerates pipeline executions by up to 1.8×. It also enables us to gain 60% to 70% of the benefit of correlation learning while saving up to 61% of training time (34 minutes per execution).

First End-to-End System for Data Programming: Snorkel is the first system to implement our recent work on data programming [5, 38].
Previous ML systems that we and others developed [52] required extensive feature engineering and model specification, leading to confusion about where to inject relevant domain knowledge. While programming weak supervision seems superficially similar to feature engineering, we observe that users approach the two processes very differently. Our vision (weak supervision as the sole port of interaction for machine learning) implies radically different workflows, requiring a proof of concept. Snorkel demonstrates that this paradigm enables users to develop high-quality models for a wide range of tasks. We report on two deployments of Snorkel, in collaboration with the U.S. Department of Veterans Affairs and Stanford Hospital and Clinics, and the U.S. Food and Drug Administration, where Snorkel improves over heuristic baselines by an average 110%. We also report results on four open-source datasets that are representative of other Snorkel deployments, including bioinformatics, medical image analysis, and crowdsourcing; on which Snorkel beats heuristics by an average 153% and comes within an average 3.60% of the predictive performance of large hand-curated training sets.

Figure 2: [Pipeline diagram. Example input: "We study a patient who became quadriplegic after parenteral magnesium administration for preeclampsia." Weak supervision sources (external KBs, patterns and dictionaries, domain heuristics) are written over the context hierarchy (Document, Sentence, Span, Entity) through the labeling function interface, e.g., Ontology(ctd, [A, B, -C]), Pattern("{{0}}causes{{1}}"), CustomFn(x,y : heuristic(x,y)). Applying them to unlabeled data yields the label matrix Λ, which passes through the modeling optimizer and generative model to produce probabilistic training data Ỹ for the discriminative model.] An overview of the Snorkel system. (1) SME users write labeling functions (LFs) that express weak supervision sources like distant supervision, patterns, and heuristics. (2) Snorkel applies the LFs over unlabeled data and learns a generative model to combine the LFs' outputs into probabilistic labels. (3) Snorkel uses these labels to train a discriminative classification model, such as a deep neural network.

2. SNORKEL ARCHITECTURE
Snorkel's workflow is designed around data programming [5, 38], a fundamentally new paradigm for training machine learning models using weak supervision, and proceeds in three main stages (Figure 2):

1. Writing Labeling Functions: Rather than hand-labeling training data, users of Snorkel write labeling functions, which allow them to express various weak supervision sources such as patterns, heuristics, external knowledge bases, and more. This was the component most informed by early interactions (and mistakes) with users over the last year of deployment, and we present a flexible interface and supporting data model.
2. Modeling Accuracies and Correlations: Next, Snorkel automatically learns a generative model over the labeling functions, which allows it to estimate their accuracies and correlations. This step uses no ground-truth data, learning instead from the agreements and disagreements of the labeling functions. We observe that this step improves end predictive performance 5.81% over Snorkel with unweighted label combination, and anecdotally that it streamlines the user development experience by providing actionable feedback about labeling function quality.
3. Training a Discriminative Model: The output of Snorkel is a set of probabilistic labels that can be used to train a wide variety of state-of-the-art machine learning models, such as popular deep learning models. While the generative model is essentially a re-weighted combination of the user-provided labeling functions, which tend to be precise but low-coverage, modern discriminative models can retain this precision while learning to generalize beyond the labeling functions, increasing coverage and robustness on unseen data.

Next we set up the problem Snorkel addresses and describe its main components and design decisions.

Setup: Our goal is to learn a parameterized classification model $h_\theta$ that, given a data point $x \in \mathcal{X}$, predicts its label $y \in \mathcal{Y}$, where the set of possible labels $\mathcal{Y}$ is discrete. For simplicity, we focus on the binary setting $\mathcal{Y} = \{-1, 1\}$, though we include a multi-class application in our experiments. For example, $x$ might be a medical image, and $y$ a label indicating normal versus abnormal. In the relation extraction examples we look at, we often refer to $x$ as a candidate. In a traditional supervised learning setup, we would learn $h_\theta$ by fitting it to a training set of labeled data points. However, in our setting, we assume that we only have access to unlabeled data for training. We do assume access to a small set of labeled data used during development, called the development set, and a blind, held-out labeled test set for evaluation.
These sets can be orders of magnitude smaller than a training set, making them economical to obtain.

The user of Snorkel aims to generate training labels by providing a set of labeling functions, which are black-box functions, $\lambda : \mathcal{X} \to \mathcal{Y} \cup \{\emptyset\}$, that take in a data point and output a label, where we use $\emptyset$ to denote that the labeling function abstains. Given $m$ unlabeled data points and $n$ labeling functions, Snorkel applies the labeling functions over the unlabeled data to produce a matrix of labeling function outputs $\Lambda \in (\mathcal{Y} \cup \{\emptyset\})^{m \times n}$. The goal of the remaining Snorkel pipeline is to synthesize this label matrix $\Lambda$, which may contain overlapping and conflicting labels for each data point, into a single vector of probabilistic training labels $\tilde{Y} = (\tilde{y}_1, \ldots, \tilde{y}_m)$, where $\tilde{y}_i \in [0, 1]$. These training labels can then be used to train a discriminative model.

Next, we introduce the running example of a text relation extraction task as a proxy for many real-world knowledge base construction and data analysis tasks:

Example 2.1. Consider the task of extracting mentions of adverse chemical-disease relations from the biomedical literature (see CDR task, Section 4.1). Given documents with mentions of chemicals and diseases tagged, we refer to each co-occurring (chemical, disease) mention pair as a candidate extraction, which we view as a data point to be classified as either true or false. For example, in Figure 2, we would have two candidates with true labels $y_1$ = True and $y_2$ = False:

    x_1 = Causes("magnesium", "quadriplegic")
    x_2 = Causes("magnesium", "preeclampsia")
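To make the setup concrete, the following minimal sketch applies $n = 3$ toy labeling functions to $m = 3$ toy data points and collects the votes into a label matrix. It is not Snorkel's API: the data points are plain strings rather than Candidate objects, the three labeling functions are hypothetical, and abstentions are stored as 0 (with labels +1 and -1) purely as an encoding assumption.

```python
ABSTAIN = 0  # assumption: abstain is encoded as 0, class labels as +1 / -1


def lf_contains_causes(x):
    """Vote +1 when the word 'causes' appears, otherwise abstain."""
    return 1 if "causes" in x else ABSTAIN


def lf_contains_treats(x):
    """Vote -1 when the word 'treats' appears, otherwise abstain."""
    return -1 if "treats" in x else ABSTAIN


def lf_short_sentence(x):
    """Vote -1 on very short sentences (a deliberately weak heuristic)."""
    return -1 if len(x.split()) < 4 else ABSTAIN


def apply_lfs(lfs, data):
    """Return the m x n label matrix with entry [i][j] = lfs[j](data[i])."""
    return [[lf(x) for lf in lfs] for x in data]


data = [
    "magnesium causes quadriplegia in rare cases",
    "magnesium treats preeclampsia",
    "no relation is stated here at all",
]
lfs = [lf_contains_causes, lf_contains_treats, lf_short_sentence]
label_matrix = apply_lfs(lfs, data)
for row in label_matrix:
    print(row)
```

The second row receives two agreeing votes and the third row receives none, which illustrates the overlap and limited coverage that the rest of the pipeline has to reconcile.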

Figure 3: [Diagram of the context hierarchy (Document, Sentence, Span, Entity) and a Candidate(A, B).] Labeling functions take as input a Candidate object, representing a data point to be classified. Each Candidate is a tuple of Context objects, which are part of a hierarchy representing the local context of the Candidate.

Data Model: A design challenge is managing complex, unstructured data in a way that enables SMEs to write labeling functions over it. In Snorkel, input data is stored in a context hierarchy. It is made up of context types connected by parent/child relationships, which are stored in a relational database and made available via an object-relational mapping (ORM) layer built with SQLAlchemy. Each context type represents a conceptual component of data to be processed by the system or used when writing labeling functions; for example a document, an image, a paragraph, a sentence, or an embedded table. Candidates, i.e., data points $x$, are then defined as tuples of contexts (Figure 3).

Example 2.2. In our running CDR example, the input documents can be represented in Snorkel as a hierarchy consisting of Documents, each containing one or more Sentences, each containing one or more Spans of text. These Spans may also be tagged with metadata, such as Entity markers identifying them as chemical or disease mentions (Figure 3). A candidate is then a tuple of two Spans.

2.1 A Language for Weak Supervision
Snorkel uses the core abstraction of a labeling function to allow users to specify a wide range of weak supervision sources such as patterns, heuristics, external knowledge bases, crowdsourced labels, and more. This higher-level, less precise input is more efficient to provide (see Section 4.2), and can be automatically denoised and synthesized, as described in subsequent sections. In this section, we describe our design choices in building an interface for writing labeling functions, which we envision as a unifying programming language for weak supervision.
These choices were informed to a large degree by our interactions, primarily through weekly office hours, with Snorkel users in bioinformatics, defense, industry, and other areas over the past year. For example, while we initially intended to have a more complex structure for labeling functions, with manually specified types and correlation structure, we quickly found that simplicity in this respect was critical to usability (and not empirically detrimental to our ability to model their outputs). We also quickly discovered that users wanted either far more expressivity or far less of it, compared to our first library of function templates. We thus trade off expressivity and efficiency by allowing users to write labeling functions at two levels of abstraction: custom Python functions and declarative operators.

Hand-Defined Labeling Functions: In its most general form, a labeling function is just an arbitrary snippet of code, usually written in Python, which accepts as input a Candidate object and either outputs a label or abstains. Often these functions are similar to extract-transform-load scripts, expressing basic patterns or heuristics, but may use supporting code or resources and be arbitrarily complex. Writing labeling functions by hand is supported by the ORM layer, which maps the context hierarchy and associated metadata to an object-oriented syntax, allowing the user to easily traverse the structure of the input data.

Example 2.3. In our running example, we can write a labeling function that checks if the word "causes" appears between the chemical and disease mentions. If it does, it outputs True if the chemical mention is first and False if the disease mention is first. If "causes" does not appear, it outputs None, indicating abstention:

    def LF_causes(x):
        # Word-index ranges of the chemical and disease mentions.
        cs, ce = x.chemical.get_word_range()
        ds, de = x.disease.get_word_range()
        # Chemical first, with "causes" between the two mentions.
        if ce < ds and "causes" in x.parent.words[ce + 1:ds]:
            return True
        # Disease first, with "causes" between the two mentions.
        if de < cs and "causes" in x.parent.words[de + 1:cs]:
            return False
        return None  # abstain

We could also write this with Snorkel's declarative interface:

    LF_causes = lf_search("{{1}}.*\Wcauses\W.*{{2}}", reverse_args=False)

Declarative Labeling Functions: Snorkel includes a library of declarative operators that encode the most common weak supervision function types, based on our experience with users over the last year. These functions capture a range of common forms of weak supervision, for example:

Pattern-based: Pattern-based heuristics embody the motivation of soliciting higher information density input from SMEs. For example, pattern-based heuristics encompass feature annotations [51] and pattern-bootstrapping approaches [18, 20] (Example 2.3).

Distant supervision: Distant supervision generates training labels by heuristically aligning data points with an external knowledge base, and is one of the most popular forms of weak supervision [4, 22, 32].

Weak classifiers: Classifiers that are insufficient for our task (e.g., limited coverage, noisy, biased, and/or trained on a different dataset) can be used as labeling functions.

Labeling function generators: One higher-level abstraction that we can build on top of labeling functions in Snorkel is labeling function generators, which generate multiple labeling functions from a single resource, such as crowdsourced labels and distant supervision from structured knowledge bases (Example 2.4).

Example 2.4. A challenge in traditional distant supervision is that different subsets of knowledge bases have different levels of accuracy and coverage. In our running example, we can use the Comparative Toxicogenomics Database (CTD) as distant supervision, separately modeling different subsets of it with separate labeling functions. For example,

we might write one labeling function to label a candidate True if it occurs in the Causes subset, and another to label it False if it occurs in the Treats subset. We can write this using a labeling function generator,

    LFs_CTD = Ontology(ctd, {"Causes": True, "Treats": False})

which creates two labeling functions. In this way, generators can be connected to large resources and create hundreds of labeling functions with a line of code.

2.2 Generative Model
The core operation of Snorkel is modeling and integrating the noisy signals provided by a set of labeling functions. Using the recently proposed approach of data programming [5, 38], we model the true class label for a data point as a latent variable in a probabilistic model. In the simplest case, we model each labeling function as a noisy voter which is independent, i.e., makes errors that are uncorrelated with the other labeling functions. This defines a generative model of the votes of the labeling functions as noisy signals about the true label.

We can also model statistical dependencies between the labeling functions to improve predictive performance. For example, if two labeling functions express similar heuristics, we can include this dependency in the model and avoid a double counting problem. We observe that such pairwise correlations are the most common, so we focus on them in this paper (though handling higher order dependencies is straightforward). We use our structure learning method for generative models [5] to select a set $C$ of labeling function pairs $(j, k)$ to model as correlated (see Section 3.2).

Now we can construct the full generative model as a factor graph. We first apply all the labeling functions to the unlabeled data points, resulting in a label matrix $\Lambda$, where $\Lambda_{i,j} = \lambda_j(x_i)$.
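Returning briefly to the labeling function generator of Example 2.4, the idea of producing one labeling function per knowledge-base subset can be sketched with closures. This is an illustration only, not Snorkel's `Ontology` operator: `ontology_lfs` and `make_subset_lf` are hypothetical helpers, candidates are reduced to (chemical, disease) string pairs, and the tiny `ctd` dictionary is toy data standing in for the real database.

```python
def make_subset_lf(pairs, label, name):
    """Build one labeling function that votes `label` for candidates found
    in `pairs` and abstains (returns None) otherwise."""
    def lf(candidate):
        return label if candidate in pairs else None
    lf.__name__ = name
    return lf


def ontology_lfs(kb, subset_labels):
    """Hypothetical generator: one labeling function per knowledge-base subset.

    kb maps a subset name to a set of (chemical, disease) pairs;
    subset_labels maps a subset name to the label its members should receive.
    """
    return [
        make_subset_lf(kb[subset], label, f"LF_{subset}")
        for subset, label in subset_labels.items()
    ]


# Toy stand-in for a knowledge base with two subsets (illustrative data only).
ctd = {
    "Causes": {("magnesium", "quadriplegia")},
    "Treats": {("magnesium", "preeclampsia")},
}
lfs = ontology_lfs(ctd, {"Causes": True, "Treats": False})

candidate = ("magnesium", "preeclampsia")
votes = [lf(candidate) for lf in lfs]
print([lf.__name__ for lf in lfs], votes)
```

Keeping each subset as its own labeling function, rather than merging them into one, is what lets the generative model assign the subsets different accuracies.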
We then encode the generative model $p_w(\Lambda, Y)$ using three factor types, representing the labeling propensity, accuracy, and pairwise correlations of labeling functions:

$$\phi^{\mathrm{Lab}}_{i,j}(\Lambda, Y) = \mathbb{1}\{\Lambda_{i,j} \neq \emptyset\}$$
$$\phi^{\mathrm{Acc}}_{i,j}(\Lambda, Y) = \mathbb{1}\{\Lambda_{i,j} = y_i\}$$
$$\phi^{\mathrm{Corr}}_{i,j,k}(\Lambda, Y) = \mathbb{1}\{\Lambda_{i,j} = \Lambda_{i,k}\}, \quad (j, k) \in C$$

For a given data point $x_i$, we define the concatenated vector of these factors for all the labeling functions $j = 1, \ldots, n$ and potential correlations $C$ as $\phi_i(\Lambda, Y)$, and the corresponding vector of parameters $w \in \mathbb{R}^{2n + |C|}$. This defines our model:

$$p_w(\Lambda, Y) = Z_w^{-1} \exp\left(\sum_{i=1}^{m} w^T \phi_i(\Lambda, y_i)\right),$$

where $Z_w$ is a normalizing constant. To learn this model without access to the true labels $Y$, we minimize the negative log marginal likelihood given the observed label matrix $\Lambda$:

$$\hat{w} = \arg\min_w \; -\log \sum_{Y} p_w(\Lambda, Y).$$

We optimize this objective by interleaving stochastic gradient descent steps with Gibbs sampling ones, similar to contrastive divergence [21]; for more details, see [5, 38]. We use the Numbskull library, a Python NUMBA-based Gibbs sampler. We then use the predictions, $\tilde{Y} = p_{\hat{w}}(Y \mid \Lambda)$, as probabilistic training labels.

2.3 Discriminative Model
The end goal in Snorkel is to train a model that generalizes beyond the information expressed in the labeling functions. We train a discriminative model $h_\theta$ on our probabilistic labels $\tilde{Y}$ by minimizing a noise-aware variant of the loss $l(h_\theta(x_i), y)$, i.e., the expected loss with respect to $\tilde{Y}$:

$$\hat{\theta} = \arg\min_\theta \frac{1}{m} \sum_{i=1}^{m} \mathbb{E}_{y \sim \tilde{Y}}\left[l(h_\theta(x_i), y)\right].$$

A formal analysis shows that as we increase the amount of unlabeled data, the generalization error of discriminative models trained with Snorkel will decrease at the same asymptotic rate as traditional supervised learning models do with additional hand-labeled data [38], allowing us to increase predictive performance by adding more unlabeled data. Intuitively, this property holds because as more data is provided, the discriminative model sees more features that co-occur with the heuristics encoded in the labeling functions.

Example 2.5.
The CDR data contains the sentence, "Myasthenia gravis presenting as weakness after magnesium administration." None of the 33 labeling functions we developed vote on the corresponding Causes(magnesium, myasthenia gravis) candidate, i.e., they all abstain. However, a deep neural network trained on probabilistic training labels from Snorkel correctly identifies it as a true mention.

Snorkel provides connectors for popular machine learning libraries such as TensorFlow [2], allowing users to exploit commodity models like deep neural networks that do not require hand-engineering of features and have robust predictive performance across a wide range of tasks.

3. WEAK SUPERVISION TRADEOFFS
We study the fundamental question of when and at what level of complexity we should expect Snorkel's generative model to yield the greatest predictive performance gains. Understanding these performance regimes can help guide users, and introduces a tradeoff space between predictive performance and speed. We characterize this space in two parts: first, by analyzing when the generative model can be approximated by an unweighted majority vote, and second, by automatically selecting the complexity of the correlation structure to model. We then introduce a two-stage, rule-based optimizer to support fast development cycles.

3.1 Modeling Accuracies
The natural first question when studying systems for weak supervision is, "When does modeling the accuracies of sources improve end-to-end predictive performance?" We study that question in this subsection and propose a heuristic to identify settings in which this modeling step is most beneficial.

3.1.1 Tradeoff Space
We start by considering the label density $d_\Lambda$ of the label matrix $\Lambda$, defined as the mean number of non-abstention labels per data point. In the low-density setting, sparsity of labels will mean that there is limited room for even an optimal weighting of the labeling functions to diverge much from the majority vote. Conversely, as the label density

grows, known theory confirms that the majority vote will eventually be optimal [27]. It is the middle-density regime where we expect to most benefit from applying the generative model.

Figure 4: [Plot. x-axis: # of Labeling Functions; y-axis: Modeling Advantage; curves: Low-Density Bound, Optimizer ($\tilde{A}^*$), Optimal ($A^*$), Gen. Model ($A_w$); regions: Low-Density (choose MV), Mid-Density (choose GM), High-Density (choose MV).] A plot of the modeling advantage, i.e., the improvement in label accuracy from the generative model, as a function of the number of labeling functions (equivalently, the label density) on a synthetic dataset (see footnote 7). We plot the advantage obtained by a learned generative model (GM), $A_w$; by an optimal model $A^*$; the upper bound $\tilde{A}^*$ used in our optimizer; and the low-density bound (Proposition 1).

We start by defining a measure of the benefit of weighting the labeling functions by their true accuracies (in other words, the predictions of a perfectly estimated generative model) versus an unweighted majority vote:

Definition 1. (Modeling Advantage) Let the weighted majority vote of $n$ labeling functions on data point $x_i$ be denoted as $f_w(\Lambda_i) = \sum_{j=1}^{n} w_j \Lambda_{i,j}$, and the unweighted majority vote (MV) as $f_1(\Lambda_i) = \sum_{j=1}^{n} \Lambda_{i,j}$, where we consider the binary classification setting and represent an abstaining vote as 0. We define the modeling advantage $A_w$ as the improvement in accuracy of $f_w$ over $f_1$ for a dataset:

$$A_w(\Lambda, y) = \frac{1}{m} \sum_{i=1}^{m} \left( \mathbb{1}\{y_i f_w(\Lambda_i) > 0 \wedge y_i f_1(\Lambda_i) \le 0\} - \mathbb{1}\{y_i f_w(\Lambda_i) \le 0 \wedge y_i f_1(\Lambda_i) > 0\} \right)$$

In other words, $A_w$ is the number of times $f_w$ correctly disagrees with $f_1$ on a label, minus the number of times it incorrectly disagrees. Let the optimal advantage $A^* = A_{w^*}$ be the advantage using the optimal weights $w^*$ (WMV*).

To build intuition, we start by analyzing the optimal advantage for three regimes of label density (see Figure 6):

Low Label Density: In this sparse setting, very few data points have more than one non-abstaining label; only a small number have multiple conflicting labels.
We have observed this occurring, for example, in the early stages of application development. We see that with non-adversarial labeling functions ($w^* > 0$), even an optimal generative model (WMV*) can only disagree with MV when there are disagreeing labels, which will occur infrequently. We see that the expected optimal advantage will have an upper bound that falls quadratically with label density:

Proposition 1. (Low-Density Upper Bound) Assume that $P(\Lambda_{i,j} \neq 0) = p_l \;\forall i, j$, and $w_j^* > 0 \;\forall j$. Then, the expected label density is $\bar{d} = n p_l$, and

$$\mathbb{E}_{\Lambda, y, w^*}[A^*] = O(\bar{d}^2) \qquad (1)$$

Proof Sketch: We bound the advantage above by computing the expected number of pairwise disagreements.

Footnote 7: We generate a class-balanced dataset of $m = 1000$ data points with binary labels, and $n$ independent labeling functions with average accuracy 75% and a fixed 10% probability of voting.

Table 1: Modeling advantage $A_w$ attained using a generative model for several applications in Snorkel (Section 4.1), the upper bound $\tilde{A}^*$ used by our optimizer, the modeling strategy selected by the optimizer, either majority vote (MV) or generative model (GM), and the empirical label density $d_\Lambda$.

    Dataset      A_w (%)   Ã* (%)   Modeling Strategy   d_Λ
    Radiology                       GM                  2.3
    CDR                             GM                  1.8
    Spouses                         GM                  1.4
    Chem                            MV                  1.2
    EHR                             GM                  1.2

High Label Density: In this setting, the majority of the data points have a large number of labels. For example, we might be working in an extremely high-volume crowdsourcing setting, or an application with many high-coverage knowledge bases as distant supervision. Under modest assumptions (namely, that the average labeling function accuracy $\bar{\alpha}^*$ is greater than 50%) it is known that the majority vote converges exponentially to an optimal solution as the average label density $\bar{d}$ increases, which serves as an upper bound for the expected optimal advantage as well:

Theorem 1. (High-Density Upper Bound [27]) Assume that $P(\Lambda_{i,j} \neq 0) = p_l \;\forall i, j$, and that $\bar{\alpha}^* = \frac{1}{n} \sum_{j=1}^{n} \alpha_j^* = \frac{1}{n} \sum_{j=1}^{n} 1/(1 + \exp(-w_j^*)) > \frac{1}{2}$.
Then: E Λ,y,w [A ] e 2p l(α 1 2 ) 2 d Proof: This follows fro the result in [27] for the syetric Dawid-Skene odel under constant probability sapling. Mediu Label Density: In this iddle regie, we expect that odeling the accuracies of the labeling functions will deliver the greatest gains in predictive perforance because we will have any data points with a sall nuber of disagreeing labeling functions. For such points, the estiated labeling function accuracies can heavily affect the predicted labels. We indeed see gains in the epirical results using an independent generative odel that only includes accuracy factors φ Acc i,j (Table 1). Furtherore, the guarantees in [38] establish that we can learn the optial weights, and thus approach the optial advantage Autoatically Choosing a Modeling Strategy The bounds in the previous subsection iply that there are settings in which we should be able to safely skip odeling the labeling function accuracies, siply taking the unweighted ajority vote instead. However, in practice, the (2)

7 overall label density d Λ is insufficiently precise to deterine the transition points of interest, given a user tie-cost tradeoff preference (characterized by the advantage tolerance paraeter γ in Algorith 1). We show this in Table 1 using our application data sets fro Section 4.1. For exaple, we see that the Che and EHR label atrices have equivalent label densities; however, odeling the labeling function accuracies has a uch greater effect for EHR than for Che. Instead of siply considering the average label density d Λ, we instead develop a best-case heuristic based on looking at the ratio of positive to negative labels for each data point. This heuristic serves as an upper bound to the true expected advantage, and thus we can use it to deterine when we can safely skip training the generative odel (see Algorith 1). Let c y(λ i) = n j=1 1 {Λi,j = y} be the counts of labels of class y for x i, and assue that the true labeling function weights lie within a fixed range, w j [w in, w ax] and have a ean w. 8 Then, define: Φ(Λ i, y) = 1 {c y(λ i)w ax > c y(λ i)w in} à (Λ) = 1 1 {yf 1(Λ i) 0} Φ(Λ i, y)σ(2f w(λ i)y) y ±1 where σ( ) is the sigoid function, f w is ajority vote with all weights set to the ean w, and à (Λ) is the predicted odeling advantage used by our optiizer. Essentially, we are taking the expected counts of instances in which a weighted ajority vote could possibly flip the incorrect predictions of unweighted ajority vote under best case conditions, which is an upper bound for the expected advantage: Proposition 2. (Optiizer Upper Bound) Assue that the labeling functions have accuracy paraeters (logodds weights) w j [w in, w ax], and have E[w] = w. Then: E y,w [A Λ] à (Λ) (3) Proof Sketch: We upper-bound the odeling advantage by the expected nuber of instances in which WMV* is correct and MV is incorrect. We then upper-bound this by using the best-case probability of the weighted ajority vote being correct given (w in, w ax). 
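The two quantities defined above, the modeling advantage $A_w$ of Definition 1 and the optimizer's predicted advantage $\tilde{A}^*$, can be sketched in a few lines of NumPy. This is a minimal illustration rather than Snorkel's implementation: the function names are hypothetical, the label matrix is assumed to be a dense array with entries in {-1, 0, +1}, and the default weight range (0.5, 1.0, 1.5) is the one the paper reports in its footnote on default weights.

```python
import numpy as np


def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def modeling_advantage(L, y, w):
    """A_w from Definition 1 (hypothetical helper name).

    L: (m, n) label matrix with entries in {-1, 0, +1}; 0 means abstain.
    y: (m,) true labels in {-1, +1}.
    w: (n,) labeling function weights.
    """
    L = np.asarray(L, dtype=float)
    y = np.asarray(y, dtype=float)
    f_w = L @ np.asarray(w, dtype=float)  # weighted vote per data point
    f_1 = L.sum(axis=1)                   # unweighted vote per data point
    # Points where the weighted vote is right and the unweighted vote is not.
    gains = (y * f_w > 0) & (y * f_1 <= 0)
    # Points where the weighted vote is not right but the unweighted vote is.
    losses = (y * f_w <= 0) & (y * f_1 > 0)
    return (gains.sum() - losses.sum()) / len(y)


def predicted_advantage(L, w_min=0.5, w_mean=1.0, w_max=1.5):
    """Best-case predicted advantage (the optimizer's upper bound)."""
    L = np.asarray(L, dtype=float)
    f_1 = L.sum(axis=1)
    f_mean = w_mean * f_1  # majority vote with every weight set to the mean
    total = 0.0
    for y in (-1, 1):
        c_y = (L == y).sum(axis=1)       # votes for class y
        c_other = (L == -y).sum(axis=1)  # votes for the opposite class
        mv_not_correct = y * f_1 <= 0    # MV fails or ties if the truth is y
        can_flip = c_y * w_max > c_other * w_min  # Phi(Lambda_i, y)
        total += np.sum(mv_not_correct * can_flip * sigmoid(2 * f_mean * y))
    return total / L.shape[0]


# Toy label matrix: one disagreement, one unanimous row, one all-abstain row.
L_toy = [[1, -1, -1], [1, 1, 0], [0, 0, 0]]
y_toy = [1, 1, -1]
a_w = modeling_advantage(L_toy, y_toy, [1.5, 0.5, 0.5])  # 1/3, about 0.333
a_tilde = predicted_advantage(L_toy)                     # sigmoid(-2)/3, about 0.0397
```

Only the first row contributes to either quantity, because it is the only row with conflicting votes. Proposition 2 bounds an expectation over labels and weights, so a single hand-picked pair (y, w) on a three-row matrix, as here, can exceed the predicted value without contradicting it.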
We apply $\tilde{A}^*$ to a synthetic dataset and plot in Figure 4. Next, we compute $\tilde{A}^*$ for the labeling matrices from experiments in Section 4.1, and compare with the empirical advantage of the trained generative models (Table 1). We see that our approximate quantity $\tilde{A}^*$ serves as a correct guide in all cases for determining which modeling strategy to select, which for the mature applications reported on is indeed most often the generative model. However, we see that while EHR and Chem have equivalent label densities, our optimizer correctly predicts that Chem can be modeled with majority vote, speeding up each pipeline execution by 1.8×. We find in our applications that the optimizer can save execution time especially during the initial stages of iterative development (see full version).

Footnote 8: We fix these at defaults of $(w_{min}, \bar{w}, w_{max}) = (0.5, 1.0, 1.5)$, which corresponds to assuming labeling functions have accuracies between 62% and 82%, and an average accuracy of 73%.

3.2 Modeling Structure

In this subsection, we consider modeling additional statistical structure beyond the independent model. We study the tradeoff between predictive performance and computational cost, and describe how to automatically select a good point in this tradeoff space.

Structure Learning. We observe many Snorkel users writing labeling functions that are statistically dependent. Examples we have observed include:
- Functions that are variations of each other, such as checking for matches against similar regular expressions.
- Functions that operate on correlated inputs, such as raw tokens of text and their lemmatizations.
- Functions that use correlated sources of knowledge, such as distant supervision from overlapping knowledge bases.

Modeling such dependencies is important because they affect our estimates of the true labels. Consider the extreme case in which not accounting for dependencies is catastrophic:

Example 3.1. Consider a set of 10 labeling functions, where 5 are perfectly correlated, i.e., they vote the same way on every data point, and 5 are conditionally independent given the true label. If the correlated labeling functions have accuracy $\alpha = 50\%$ and the uncorrelated ones have accuracy $\beta = 99\%$, then the maximum likelihood estimate of their accuracies according to the independent model is $\hat{\alpha} = 100\%$ and $\hat{\beta} = 50\%$.

Specifying a generative model to account for such dependencies by hand is impractical for three reasons. First, it is difficult for non-expert users to specify these dependencies. Second, as users iterate on their labeling functions, their dependency structure can change rapidly, like when a user relaxes a labeling function to label many more candidates. Third, the dependency structure can be dataset specific, making it impossible to specify a priori, such as when a corpus contains many strings that match multiple regular expressions used in different labeling functions. We observed users of earlier versions of Snorkel struggling for these reasons to construct accurate and efficient generative models with dependencies. We therefore seek a method that can quickly identify an appropriate dependency structure from the labeling function outputs $\Lambda$ alone.

Naively, we could include all dependencies of interest, such as all pairwise correlations, in the generative model and perform parameter estimation. However, this approach is impractical. For 100 labeling functions and 10,000 data points, estimating parameters with all possible correlations takes roughly 45 minutes. When multiplied over repeated runs of hyperparameter searching and development cycles, this cost greatly inhibits labeling function development. We therefore turn to our method for automatically selecting which dependencies to model without access to ground truth [5]. It uses a pseudolikelihood estimator, which does not require any sampling or other approximations to compute the objective gradient exactly. It is much faster than maximum likelihood estimation, taking 15 seconds to select pairwise correlations to be modeled among 100 labeling functions with 10,000 data points. However, this approach relies on a selection threshold hyperparameter $\epsilon$ which induces a tradeoff space between predictive performance and computational cost.

Figure 5: Predictive performance of the generative model and number of learned correlations versus the correlation threshold $\epsilon$. The selected elbow point achieves a good tradeoff between predictive performance and computational cost (linear in the number of correlations). Left: simulation of structure learning correcting the generative model. Middle: the CDR task. Right: all user study labeling functions for the Spouses task. Each panel plots predictive performance (F1) and the number of correlations against the correlation threshold.

Tradeoff Space

Such structure learning methods, whether pseudolikelihood or likelihood-based, crucially depend on a selection threshold $\epsilon$ for deciding which dependencies to add to the generative model. Fundamentally, the choice of $\epsilon$ determines the complexity of the generative model (footnote 9). We study the tradeoff between predictive performance and computational cost that this induces. We find that generally there is an elbow point beyond which the number of correlations selected (and thus the computational cost) explodes, and that this point is a safe tradeoff point between predictive performance and computation time.

Footnote 9: Specifically, $\epsilon$ is both the coefficient of the $\ell_1$ regularization term used to induce sparsity, and the minimum absolute weight in log scale that a dependency must have to be selected.

Predictive Performance: At one extreme, a very large value of $\epsilon$ will not include any correlations in the generative model, making it identical to the independent model. As $\epsilon$ is decreased, correlations will be added. At first, when $\epsilon$ is still high, only the strongest correlations will be included. As these correlations are added, we observe that the generative model's predictive performance tends to improve. Figure 5, left, shows the result of varying $\epsilon$ in a simulation where more than half the labeling functions are correlated. After adding a few key dependencies, the generative model resolves the discrepancies among the labeling functions. Figure 5, middle, shows the effect of varying $\epsilon$ for the CDR task. Predictive performance improves as $\epsilon$ decreases until the model overfits. Finally, we consider a large number of labeling functions that are likely to be correlated. In our user study (described in Section 4.2), participants wrote labeling functions for the Spouses task. We combined all 125 of their functions and studied the effect of varying $\epsilon$. Here, we expect there to be many correlations since it is likely that users wrote redundant functions. We see in Figure 5, right, that structure learning surpasses the best performing individual's generative model (50.0 F1).

Computational Cost: Computational cost is correlated with model complexity. Since learning in Snorkel is done with a Gibbs sampler, the overhead of modeling additional correlations is linear in the number of correlations. The dashed lines in Figure 5 show the number of correlations included in each model versus $\epsilon$. For example, on the Spouses task, fitting the parameters of the generative model at $\epsilon = 0.5$ takes 4 minutes, and fitting its parameters with $\epsilon =$ [...] takes 57 minutes. Further, parameter estimation is often run repeatedly during development for two reasons: (i) fitting generative model hyperparameters using a development set requires repeated runs, and (ii) as users iterate on their labeling functions, they must re-estimate the generative model to evaluate the [...]

Automatically Choosing a Model

Based on our observations, we seek to automatically choose a value of $\epsilon$ that trades off between predictive performance and computational cost using the labeling functions' outputs $\Lambda$ alone. Including $\epsilon$ as a hyperparameter in a grid search over a development set is generally not feasible because of its large effect on running time. We therefore want to choose $\epsilon$ before other hyperparameters, without performing any parameter estimation. We propose using the number of correlations selected at each value of $\epsilon$ as an inexpensive indicator. The dashed lines in Figure 5 show that as $\epsilon$ decreases, the number of selected correlations follows a pattern. Generally, the number of correlations grows slowly at first, then hits an elbow point beyond which the number explodes, which fits the assumption that the correlation structure is sparse. In all three cases, setting $\epsilon$ to this elbow point is a safe tradeoff between predictive performance and computational cost. In cases where performance grows consistently (left and right), the elbow point achieves most of the predictive performance gains at a small fraction of the computational cost. For example, on Spouses (right), choosing $\epsilon = 0.08$ achieves a score of 56.6 F1 (within one point of the best score) but only takes 8 minutes for parameter estimation. In cases where predictive performance eventually degrades (middle), the elbow point also selects a relatively small number of correlations, giving a 0.7 F1 point improvement and avoiding overfitting.

Performing structure learning for many settings of $\epsilon$ is inexpensive, especially since the search needs to be performed only once before tuning the other hyperparameters. On the large number of labeling functions in the Spouses task, structure learning for 25 values of $\epsilon$ takes 14 minutes. On CDR, with a smaller number of labeling functions, it takes 30 seconds. Further, if the search is started at a low value of $\epsilon$ and increased, it can often be terminated early, when the number of selected correlations reaches a low value.

Selecting the elbow point itself is straightforward. We use the point with greatest absolute difference from its neighbors, but more sophisticated schemes can also be applied [43].
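The elbow rule in the last paragraph can be sketched as follows. The text does not spell out how "difference from its neighbors" is computed, so this sketch assumes one plausible reading: score each interior point by the absolute second difference of the correlation counts, i.e., how far the point lies from the average of its two neighbors. The function name and the sample counts are hypothetical, chosen only to show the shape of the computation.

```python
def select_elbow_point(structures):
    """Pick the threshold whose correlation count bends most sharply.

    structures: list of (num_correlations, epsilon) pairs, ordered by epsilon.
    Returns the epsilon of the interior point with the largest absolute
    second difference in num_correlations (an assumed reading of
    "greatest absolute difference from its neighbors").
    """
    if len(structures) < 3:
        raise ValueError("need at least three points to locate an elbow")
    best_index, best_score = None, -1.0
    for i in range(1, len(structures) - 1):
        prev_count = structures[i - 1][0]
        count = structures[i][0]
        next_count = structures[i + 1][0]
        score = abs(prev_count - 2 * count + next_count)
        if score > best_score:
            best_index, best_score = i, score
    return structures[best_index][1]


# Illustrative counts only: slow growth, then a jump below epsilon = 0.2.
sweep = [(0, 0.5), (1, 0.4), (2, 0.3), (3, 0.2), (50, 0.1)]
elbow = select_elbow_point(sweep)  # 0.2
```

On this made-up sweep, the count rises by one per step until the threshold drops from 0.2 to 0.1, so the rule selects 0.2 as the last point before the count takes off.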
Our full optimization algorithm for choosing a modeling strategy and (if necessary) correlations is shown in Algorithm 1.
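The control flow of the optimizer can also be sketched as a small Python driver. This is an illustration under stated assumptions, not Snorkel's API: the three callables passed in (`predicted_advantage`, `learn_structure`, `select_elbow_point`) are hypothetical stand-ins for the predicted-advantage bound, the structure learner, and the elbow selection described in this section, and the return convention (a strategy name plus an optional threshold) is invented for the example.

```python
def choose_modeling_strategy(label_matrix, gamma, eta,
                             predicted_advantage, learn_structure,
                             select_elbow_point):
    """Return ("MV", None) or ("GM", epsilon).

    gamma: advantage tolerance; below it, majority vote is considered enough.
    eta: structure search resolution; thresholds eta, 2*eta, ..., up to 0.5.
    The three callables are hypothetical stand-ins supplied by the caller.
    """
    # Step 1: skip the generative model when the predicted gain is too small.
    if predicted_advantage(label_matrix) < gamma:
        return ("MV", None)

    # Step 2: sweep thresholds and record how many correlations each selects.
    structures = []
    num_steps = int(round(1.0 / (2.0 * eta)))
    for i in range(1, num_steps + 1):
        epsilon = i * eta
        correlations = learn_structure(label_matrix, epsilon)
        structures.append((len(correlations), epsilon))

    # Step 3: train the generative model at the elbow of that sweep.
    return ("GM", select_elbow_point(structures))
```

Passing the components in as arguments keeps the sketch independent of any particular structure learner; in practice the sweep dominates the cost, since each threshold requires one structure learning run.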

Algorithm 1 Modeling Strategy Optimizer
Input: Label matrix $\Lambda \in (\mathcal{Y} \cup \{\emptyset\})^{m \times n}$, advantage tolerance $\gamma$, structure search resolution $\eta$
Output: Modeling strategy
  if $\tilde{A}^*(\Lambda) < \gamma$ then
    return MV
  Structures $\leftarrow$ [ ]
  for $i$ from 1 to $\frac{1}{2\eta}$ do
    $\epsilon \leftarrow i \cdot \eta$
    $C \leftarrow$ LearnStructure($\Lambda$, $\epsilon$)
    Structures.append($|C|$, $\epsilon$)
  $\epsilon \leftarrow$ SelectElbowPoint(Structures)
  return GM$_\epsilon$

4. EVALUATION

We evaluate Snorkel by drawing on deployments developed in collaboration with users. We report on two real-world deployments and four tasks on open-source data sets representative of other deployments. Our evaluation is designed to support the following three main claims:
- Snorkel outperforms distant supervision baselines. In distant supervision [32], one of the most popular forms of weak supervision used in practice, an external knowledge base is heuristically aligned with input data to serve as noisy training labels. By allowing users to easily incorporate a broader, more heterogeneous set of weak supervision sources, Snorkel exceeds models trained via distant supervision by an average of 132%.
- Snorkel approaches hand supervision. We see that by writing tens of labeling functions, we were able to approach or match results using hand-labeled training data which took weeks or months to assemble, coming within 2.11% of the F1 score of hand supervision on relation extraction tasks and an average 5.08% accuracy or AUC on cross-modal tasks, for an average 3.60% across all tasks.
- Snorkel enables a new interaction paradigm. We measure Snorkel's efficiency and ease-of-use by reporting on a user study of biomedical researchers from across the U.S. These participants learned to write labeling functions to extract relations from news articles as part of a two-day workshop on learning to use Snorkel, and matched or outperformed models trained on hand-labeled training data, showing the efficiency of Snorkel's process even for first-time users.

We now describe our results in detail. First, we describe the six applications that validate our claims.
We then show that Snorkel's generative modeling stage helps to improve the predictive performance of the discriminative model, demonstrating that it is 5.81% more accurate when trained on Snorkel's probabilistic labels versus labels produced by an unweighted average of labeling functions. We also validate that the ability to incorporate many different types of weak supervision incrementally improves results with an ablation study. Finally, we describe the protocol and results of our user study.

4.1 Applications

To evaluate the effectiveness of Snorkel, we consider several real-world deployments and tasks on open-source datasets that are representative of other deployments in information extraction, medical image classification, and crowdsourced sentiment analysis. Summary statistics of the tasks are provided in Table 2.

Table 2: Number of labeling functions, fraction of positive labels (for binary classification tasks), number of training documents, and number of training candidates for each task.

Task        # LFs   % Pos.   # Docs   # Candidates
Chem                         ,753     65,398
EHR                          ,        ,607
CDR                                   ,272
Spouses                      ,073     22,195
Radiology                    ,851     3,851
Crowd

Discriminative Models: One of the key bets in Snorkel's design is that the trend of increasingly powerful, open-source machine learning tools (e.g., models, pre-trained word embeddings and initial layers, automatic tuners, etc.) will only continue to accelerate. To best take advantage of this, Snorkel creates probabilistic training labels for any discriminative model with a standard loss function. In the following experiments, we control for end model selection by using currently popular, standard choices across all settings. For text modalities, we choose a bidirectional long short term memory (LSTM) sequence model [17], and for the medical image classification task we use a 50-layer ResNet [19] pre-trained on the ImageNet object classification dataset [14].
Both models are implemented in TensorFlow [2] and trained using the Adam optimizer [24], with hyperparameters selected via random grid search using a small labeled development set. Final scores are reported on a held-out labeled test set. See full version for details.

A key takeaway of the following results is that the discriminative model generalizes beyond the heuristics encoded in the labeling functions (as in Example 2.5). In Section 4.1.1, we see that on relation extraction applications the discriminative model improves performance over the generative model primarily by increasing recall by 43.15% on average. In Section 4.1.2, the discriminative model classifies entirely new modalities of data to which the labeling functions cannot be applied.

4.1.1 Relation Extraction from Text

We first focus on four relation extraction tasks on text data, as it is a challenging and common class of problems that are well studied and for which distant supervision is often considered. Predictive performance is summarized in Table 3. We briefly describe each task.

Scientific Articles (Chem): With modern online repositories of scientific literature, such as PubMed (footnote 10) for biomedical articles, research results are more accessible than ever before. However, actually extracting fine-grained pieces of information in a structured format and using this data to answer specific questions at scale remains a significant open challenge for researchers. To address this challenge in the


More information

The Level of Participation in Deliberative Arenas

The Level of Participation in Deliberative Arenas The Level of Participation in Deliberative Arenas Autoria: Leonardo Secchi, Fabrizio Plebani Abstract This paper ais to contribute to the discussion on levels of participation in collective decision-aking

More information

Implications of ASHRAE s Guidance On Ventilation for Smoking-Permitted Areas

Implications of ASHRAE s Guidance On Ventilation for Smoking-Permitted Areas Copyright 24, Aerican Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc. This posting is by perission fro ASHRAE Journal. This article ay not be copied nor distributed in either paper

More information

Physical Activity Training for

Physical Activity Training for Physical Activity Training for Functional Mobility in Older Persons To Hickey Fredric M. Wolf Lynne S. Robins University of Michigan Marilyn B. Wagner Cleveland State University Wafa Harik Case Western

More information

OBESITY EPIDEMICS MODELLING BY USING INTELLIGENT AGENTS

OBESITY EPIDEMICS MODELLING BY USING INTELLIGENT AGENTS OBESITY EPIDEMICS MODELLING BY USING INTELLIGENT AGENTS Agostino G. Bruzzone, MISS DIPTEM University of Genoa Eail agostino@iti.unige.it - URL www.iti.unige.i t Vera Novak BIDMC, Harvard Medical School

More information

Policy Trap and Optimal Subsidization Policy under Limited Supply of Vaccines

Policy Trap and Optimal Subsidization Policy under Limited Supply of Vaccines olicy Trap and Optial Subsidization olicy under Liited Supply of Vaccines Ming Yi 1,2, Achla Marathe 1,3 * 1 Networ Dynaics and Siulation Science Laboratory, VBI, Virginia Tech, Blacsburg, Virginia, United

More information

Survival and Probability of Cure Without and With Operation in Complete Atrioventricular Canal

Survival and Probability of Cure Without and With Operation in Complete Atrioventricular Canal ORIGINAL ARTICLES Survival and Probability of Cure Without and With Operation in Coplete Atrioventricular Canal Thoas J. Berger, M.D., Eugene H. Blackstone, M.D., John W. Kirklin, M.D., L. M. Bargeron,

More information

Investigation of Binaural Interference in Normal-Hearing and Hearing-Impaired Adults

Investigation of Binaural Interference in Normal-Hearing and Hearing-Impaired Adults J A Acad Audiol 11 : 494-500 (2000) Investigation of Binaural Interference in Noral-Hearing and Hearing-Ipaired Adults Rose L. Allen* Brady M. Schwab* Jerry L. Cranford* Michael D. Carpenter* Abstract

More information

Dendritic Inhibition Enhances Neural Coding Properties

Dendritic Inhibition Enhances Neural Coding Properties Dendritic Inhibition Enhances Neural Coding Properties M.W. Spratling and M.H. Johnson Centre for Brain and Cognitive Developent, Birkbeck College, London, UK The presence of a large nuber of inhibitory

More information

Outlier Analysis. Lijun Zhang

Outlier Analysis. Lijun Zhang Outlier Analysis Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Extreme Value Analysis Probabilistic Models Clustering for Outlier Detection Distance-Based Outlier Detection Density-Based

More information

Identification of Consumer Adverse Drug Reaction Messages on Social Media

Identification of Consumer Adverse Drug Reaction Messages on Social Media Association for Inforation Systes AIS Electronic Library (AISeL) PACIS 2013 Proceedings Pacific Asia Conference on Inforation Systes (PACIS) 6-18-2013 Identification of Consuer Adverse Drug Reaction Messages

More information

AIDS Epidemiology. Min Shim Math 164: Scientific Computing. April 30, 2004

AIDS Epidemiology. Min Shim Math 164: Scientific Computing. April 30, 2004 AIDS Epideiology Min Shi Math 64: Scientiic Coputing April 30, 004 Abstract Thopson s AIDS epideic odel, which is orulated in his article AIDS: The Misanageent o an Epideic, published in Great Britain,

More information

Beating by hitting: Group Competition and Punishment

Beating by hitting: Group Competition and Punishment Beating by hitting: Group Copetition and Punishent Eva van den Broek *, Martijn Egas **, Laurens Goes **, Arno Riedl *** This version: February, 2008 Abstract Both group copetition and altruistic punishent

More information

Noise Spectrum Estimation using Gaussian Mixture Model-based Speech Presence Probability for Robust Speech Recognition

Noise Spectrum Estimation using Gaussian Mixture Model-based Speech Presence Probability for Robust Speech Recognition Noise Spectru Estiation using Gaussian Mixture Model-bed Speech Presence Probability for Robust Speech Recognition M. J. Ala 2 P. Kenny P. Duouchel 2 D. O'Shaughnessy 3 CRIM Montreal Canada 2 ETS Montreal

More information

Application of Factor Analysis on Academic Performance of Pupils

Application of Factor Analysis on Academic Performance of Pupils Aerican Journal of Applied Matheatics and Statistics, 07, Vol. 5, No. 5, 64-68 Available online at http://pubs.sciepub.co/aas/5/5/ Science and Education Publishing DOI:0.69/aas-5-5- Application of Factor

More information

Development of new muscle contraction sensor to replace semg for using in muscles analysis fields

Development of new muscle contraction sensor to replace semg for using in muscles analysis fields Loughborough University Institutional Repository Developent of new uscle contraction sensor to replace semg for using in uscles analysis fields This ite was subitted to Loughborough University's Institutional

More information

A TWO-DIMENSIONAL THERMODYNAMIC MODEL TO PREDICT HEART THERMAL RESPONSE DURING OPEN CHEST PROCEDURES

A TWO-DIMENSIONAL THERMODYNAMIC MODEL TO PREDICT HEART THERMAL RESPONSE DURING OPEN CHEST PROCEDURES A TWO-DIMENSIONAL THERMODYNAMIC MODEL TO PREDICT HEART THERMAL RESPONSE DURING OPEN CHEST PROCEDURES F. G. Dias, J. V. C. Vargas, and M. L. Brioschi Universidade Federal do Paraná Departaento de Engenharia

More information

Selective Averaging of Rapidly Presented Individual Trials Using fmri

Selective Averaging of Rapidly Presented Individual Trials Using fmri Huan Brain Mapping 5:329 340(1997) Selective Averaging of Rapidly Presented Individual Trials Using fmri Anders M. Dale* and Randy L. Buckner Massachusetts General Hospital Nuclear Magnetic Resonance Center

More information

A CLOUD-BASED ARCHITECTURE PROPOSAL FOR REHABILITATION OF APHASIA PATIENTS

A CLOUD-BASED ARCHITECTURE PROPOSAL FOR REHABILITATION OF APHASIA PATIENTS Rev. Rou. Sci. Techn. Électrotechn. et Énerg. Vol. 62, 3, pp. 332 337, Bucarest, 2017 A CLOUD-BASED ARCHITECTURE PROPOSAL FOR REHABILITATION OF APHASIA PATIENTS DORIN CÂRSTOIU 1, VIRGINIA ECATERINA OLTEAN

More information

Outcome measures in palliative care for advanced cancer patients: a review

Outcome measures in palliative care for advanced cancer patients: a review Journal of Public Health Medicine Vol. 19, No. 2, pp. 193-199 Printed in Great Britain Outcoe easures in for advanced cancer s: a review Julie Hearn and Irene J. Higginson Suary Inforation generated using

More information

A simple practice guide for dose conversion between animals and human

A simple practice guide for dose conversion between animals and human Review Article A siple practice guide for dose conversion between anials and huan Abstract Understanding the concept of extrapolation of dose between species is iportant for pharaceutical researchers when

More information

(From the Department of Biology, St. Louis University, and the Department of Pathology, St. Louis University School of Medicine, St.

(From the Department of Biology, St. Louis University, and the Department of Pathology, St. Louis University School of Medicine, St. EFFECT OF ENZYME INHIBITORS AND ACTIVATORS ON THE MULTIPLICATION OF TYPHUS RICKETTSIAE II. 'remperatiyre, POTASSIII~ CYANIDE, AND TOLITIDIN ]3LIIE BY DONALD GREIFF, Sc.D., AND HENRY PINKERTON, M.D. (Fro

More information

Sawtooth Software. MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB RESEARCH PAPER SERIES. Bryan Orme, Sawtooth Software, Inc.

Sawtooth Software. MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB RESEARCH PAPER SERIES. Bryan Orme, Sawtooth Software, Inc. Sawtooth Software RESEARCH PAPER SERIES MaxDiff Analysis: Simple Counting, Individual-Level Logit, and HB Bryan Orme, Sawtooth Software, Inc. Copyright 009, Sawtooth Software, Inc. 530 W. Fir St. Sequim,

More information

Acetic Acid in Vinegar by Acid/Base Titration

Acetic Acid in Vinegar by Acid/Base Titration Acetic Acid in Vinegar by Acid/Base Titration Background Vinegar is typically ade fro the ferentation of alcoholic liquids such as wine. Its ain constituents are water and acetic acid, usually about 5

More information

Captioning Videos Using Large-Scale Image Corpus

Captioning Videos Using Large-Scale Image Corpus Du XY, Yang Y, Yang L et al. Captioning videos using large-scale iage corpus. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 32(3): 480 493 May 207. DOI 0.007/s3-07-738-7 Captioning Videos Using Large-Scale

More information

ASBESTOSIS AND PRIMARY INTRATHORACIC NEOPLASMS

ASBESTOSIS AND PRIMARY INTRATHORACIC NEOPLASMS ASBESTOSIS AND PRIMARY INTRATHORACIC NEOPLASMS Willia D. Buchanan Ministry of Labour, London, Gwat Britain The purpose of this paper is to describe one aspect of a study, which has been continued over

More information

LONG-TERM PROGNOSIS OF SEIZURES WITH ONSET IN CHILDHOOD LONG-TERM PROGNOSIS OF SEIZURES WITH ONSET IN CHILDHOOD. Patients

LONG-TERM PROGNOSIS OF SEIZURES WITH ONSET IN CHILDHOOD LONG-TERM PROGNOSIS OF SEIZURES WITH ONSET IN CHILDHOOD. Patients LONG-TERM PROGNOSIS OF SEIZURES WITH ONSET IN CHILDHOOD LONG-TERM PROGNOSIS OF SEIZURES WITH ONSET IN CHILDHOOD MATTI SILLANPÄÄ, M.D., PH.D., MERJA JALAVA, M.D., PH.D., OLLI KALEVA, B.SC., AND SHLOMO SHINNAR,

More information

Evaluation of an In-Situ Output Probe-Microphone Method for Hearing Aid Fitting Verification*

Evaluation of an In-Situ Output Probe-Microphone Method for Hearing Aid Fitting Verification* 0196/0202/90/1101-003 1$02.00/0 EAR AND HEARNG Copyright 0 1990 by The Willias & Wilkins Co. Vol., No. Printed in U. S. A. AMPLFCATON AND AURAL REHABLTATON Evaluation of an n-situ Output Probe-Microphone

More information

A Predictive Chronological Model of Multiple Clinical Observations T R A V I S G O O D W I N A N D S A N D A M. H A R A B A G I U

A Predictive Chronological Model of Multiple Clinical Observations T R A V I S G O O D W I N A N D S A N D A M. H A R A B A G I U A Predictive Chronological Model of Multiple Clinical Observations T R A V I S G O O D W I N A N D S A N D A M. H A R A B A G I U T H E U N I V E R S I T Y O F T E X A S A T D A L L A S H U M A N L A N

More information

Paul Bennett, Microsoft Research (CLUES) Joint work with Ben Carterette, Max Chickering, Susan Dumais, Eric Horvitz, Edith Law, and Anton Mityagin.

Paul Bennett, Microsoft Research (CLUES) Joint work with Ben Carterette, Max Chickering, Susan Dumais, Eric Horvitz, Edith Law, and Anton Mityagin. Paul Bennett, Microsoft Research (CLUES) Joint work with Ben Carterette, Max Chickering, Susan Dumais, Eric Horvitz, Edith Law, and Anton Mityagin. Why Preferences? Learning Consensus from Preferences

More information

HIGH-PRECISION BIDECADAL CALIBRATION OF THE RADIOCARBON TIME SCALE, BC

HIGH-PRECISION BIDECADAL CALIBRATION OF THE RADIOCARBON TIME SCALE, BC [RADIOCARBON, VOL. 35, No. 1, 1993, P. 25-33] HIGH-PRECISION BIDECADAL CALIBRATION OF THE RADIOCARBON TIME SCALE, 5-25 BC GORDON W. PEARSON Retired fro Palaeoecology Centre, The Queen's University of Belfast,

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

Application guide. High speed migration SYSTIMAX InstaPATCH 360 and SYSTIMAX Ultra-Low-Loss configuration guideline

Application guide. High speed migration SYSTIMAX InstaPATCH 360 and SYSTIMAX Ultra-Low-Loss configuration guideline Application guide High speed igration SYSTIMAX InstaPATCH 360 and SYSTIMAX Ultra-Low-Loss configuration guideline Contents Contents... 2 SYSTIMAX preterinated fiber-optic cabling systes configuration guide...

More information

An original approach to the diagnosis of scolineinduced

An original approach to the diagnosis of scolineinduced J. clin Path., 1972, 25, 422-426 An original approach to the diagnosis of scolineinduced apnoea A. FSHTAL, R. T. EVANS, AND C. N. CHAPMAN Fro the Departent ofpathology, Southead General Hospital, Bristol

More information

Kinetic Study of Gluconic Acid Batch Fermentation by Aspergillus niger

Kinetic Study of Gluconic Acid Batch Fermentation by Aspergillus niger World Acadey of cience, Engineering and Technology 7 29 Kinetic tudy of Gluconic Acid Batch Ferentation by Aspergillus niger Akbarningru Fatawati, Rudy Agustriyanto, and Lindawati Abstract Gluconic acid

More information

7-13 & A

7-13 & A GCEC-340 Table of Contents Diensions........................... p. 2 Reference Drawings................... p. 3 Iportant Safety Instructions........... p. 4 Before You Begin...................... p. 5

More information

Effects of Electrode Montage on the Spectral Composition of the Infant Auditory Brainstem Response

Effects of Electrode Montage on the Spectral Composition of the Infant Auditory Brainstem Response J A Acad Audiol 7 : 269-273 (1996) ffects of lectrode Montage on the Spectral Coposition of the Infant Auditory Brainste Response Bharti Katbana* David A. Metz* Shari L. Bennett* Patricia A. Doklert Abstract

More information

II 111 I l~1~i:ntation PAGE IM o 7408

II 111 I l~1~i:ntation PAGE IM o 7408 ~AD-A237 95 S T R D A 3 5 For Approved RE II II 111 I l~1~i:ntation PAGE IM o 748 1a. REP lb RESTRICTIVE MARKINGS NT, NA 2a. SECURITY CLASSIFICATION,'AUThORITY ' 3 DISTRIBUTION /AVAILABILITY OF REPORT

More information

Did Modeling Overestimate the Transmission Potential of Pandemic (H1N1-2009)? Sample Size Estimation for Post-Epidemic Seroepidemiological Studies

Did Modeling Overestimate the Transmission Potential of Pandemic (H1N1-2009)? Sample Size Estimation for Post-Epidemic Seroepidemiological Studies Georgia State University ScholarWorks @ Georgia State University Public Health Faculty Publications School of Public Health 2011 Did Modeling Overestiate the Transission Potential of Pandeic (H1N1-2009)?

More information

Kinetic Study of Gluconic Acid Batch Fermentation by Aspergillus niger

Kinetic Study of Gluconic Acid Batch Fermentation by Aspergillus niger World Acadey of cience, Engineering and Technology International Journal of Cheical and Molecular Engineering Vol:3, No:9, 29 Kinetic tudy of Gluconic Acid Batch Ferentation by Aspergillus niger Akbarningru

More information

Case-based reasoning using electronic health records efficiently identifies eligible patients for clinical trials

Case-based reasoning using electronic health records efficiently identifies eligible patients for clinical trials Case-based reasoning using electronic health records efficiently identifies eligible patients for clinical trials Riccardo Miotto and Chunhua Weng Department of Biomedical Informatics Columbia University,

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 Exam policy: This exam allows one one-page, two-sided cheat sheet; No other materials. Time: 80 minutes. Be sure to write your name and

More information

Computational modelling of amino acid transfer interactions in the placenta

Computational modelling of amino acid transfer interactions in the placenta Exp Physiol 95.7 pp 829 840 829 Experiental Physiology Research Paper Coputational odelling of aino acid transfer interactions in the placenta B. G. Sengers 1,C.P.Please 2 andr.m.lewis 3 1 Bioengineering

More information

CRITICAL REVIEW OF EPILEPTIC PREDICTION MODEL USING EEG

CRITICAL REVIEW OF EPILEPTIC PREDICTION MODEL USING EEG CRITICAL REVIEW OF EPILEPTIC PREDICTION MODEL USING EEG Anju Shaikh 1 and Dr.Mukta Dhopeshwarkar 2 1 Assistant Professor,Departent Of Coputer Science & IT,Deogiri College, Aurangabad, India. 2 Assistant

More information

Transient Evoked Otoacoustic Emissions and Pseudohypacusis

Transient Evoked Otoacoustic Emissions and Pseudohypacusis A Acad Audiol 6 : 293-31 (1995) Transient Evoked Otoacoustic Eissions and Pseudohypacusis Frank E. Musiek* Steven P. Bornsteint Willia F. Rintelann$ Abstract The audiologic diagnosis of pseudohypacusis

More information

Spatiotemporal Dynamics of Tuberculosis Disease and Vaccination Impact in North Senatorial Zone Taraba State Nigeria

Spatiotemporal Dynamics of Tuberculosis Disease and Vaccination Impact in North Senatorial Zone Taraba State Nigeria IOSR Journal of Matheatics (IOSR-JM ISSN: 78-578. Volue, Issue (Sep-Oct. 1, PP 1- Spatioteporal Dynaics of Tuberculosis Disease and Vaccination Ipact in North Senatorial Zone Taraba State Nigeria 1 A.

More information

1. Introduction. {< > 1, < > 2 }

1. Introduction.   {< > 1, < > 2 } Variability of QRS Signal in Electrocardiogras Using Wavelets R.Shantha Selva Kuari S.Issac iwas Dr.V.Sadasiva 3 Selection grade lecturer ECE Departent epco Schlenk Engg. College Sivakasi. III seester.e.

More information

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc.

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc. [Type tet] [Type tet] [Type tet] ISSN : 0974-7435 Volue 10 Issue 8 BioTechnology 014 An Indian Journal FULL PAPER BTAIJ, 10(8), 014 [714-71] Factor analysis-based Chinese universities aerobics sustainable

More information

interactions (mechanism of folding/contact free energies/range of interactions/monte Carlo)

interactions (mechanism of folding/contact free energies/range of interactions/monte Carlo) Proc. Nat. Acad. Sci. USA Vol. 72, No. 1, pp. 382386, October 1975 Cheistry Model of protein folding: Inclusion of short, ediu, and longrange interactions (echanis of folding/contact free energies/range

More information

plant tissue. III. The role of the root cortex cells

plant tissue. III. The role of the root cortex cells the Acta Bot. Neerl. 22(5), October 1973, p. 529542. Diffusion and absorption of ions in plant tissue. III. The role of the root cortex cells 1 in ion absorption G.G.J. Bange Botanisch Laboratoriu, Leiden

More information

Lung Volume Reduction Surgery Using the NETT Selection Criteria

Lung Volume Reduction Surgery Using the NETT Selection Criteria Lung Volue Reduction Surgery Using the NETT Selection Criteria Mark E. Ginsburg, MD, Byron M. Thoashow, MD, Chun K. Yip, MD, Angela M. DiMango, MD, Roger A. Maxfield, MD, Matthew N. Bartels, MD, Patricia

More information

Community Health Environment Scan Survey (CHESS): a novel tool that captures the impact of the built environment on lifestyle factors

Community Health Environment Scan Survey (CHESS): a novel tool that captures the impact of the built environment on lifestyle factors æstudy DESIGN ARTICLE Counity Health Environent Scan Survey (CHESS): a novel tool that captures the ipact of the built environent on lifestyle factors Fiona Wong 1 *, Denise Stevens 1, Kathleen O Connor-Duffany

More information

Strain-rate Dependent Stiffness of Articular Cartilage in Unconfined Compression

Strain-rate Dependent Stiffness of Articular Cartilage in Unconfined Compression L. P. Li* Biosyntech Inc., 475 Arand-Frappier Blvd., Park of Science and High Technology Laval, Quebec, Canada H7V 4B3 M. D. Buschann Departent of Cheical Engineering and Institute of Bioedical Engineering,

More information

Evolutionary and Ecological Perspectives on Systems Diseases using Agent-based Modeling

Evolutionary and Ecological Perspectives on Systems Diseases using Agent-based Modeling Evolutionary and Ecological Perspectives on Systes Diseases using Agent-based Modeling Swarfest 2014 Notre Dae University South Bend, IN, June 30, 2014 Gary An, MD Associate Professor of Surgery Departent

More information

MODELING ABOVEGROUND BIOMASS ACCUMULATION OF COTTON ABSTRACT

MODELING ABOVEGROUND BIOMASS ACCUMULATION OF COTTON ABSTRACT Jia et al., The Journal of Anial & Plant Sciences, 4(1): 014, Page: J. 80-89 Ani. Plant Sci. 4(1):014 ISSN: 1018-7081 MODELING ABOVEGROUND BIOMASS ACCUMULATION OF COTTON B. Jia, H. B. He, F. Y. Ma *, M.

More information

INTERNATIONAL JOURNAL OF PHARMACEUTICAL RESEARCH AND BIO-SCIENCE

INTERNATIONAL JOURNAL OF PHARMACEUTICAL RESEARCH AND BIO-SCIENCE INTERNATIONAL JOURNAL OF PHARMACEUTICAL RESEARCH AND BIO-SCIENCE NEW METHOD DEVELOPMENT AND VALIDATION OF UV-SPECTROPHOTOMETER FOR THE ESTIMATION OF PALIPERIDONE IN BULK AND PHARMACEUTICAL DOSAGE FORM

More information

Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation

Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation Aldebaro Klautau - http://speech.ucsd.edu/aldebaro - 2/3/. Page. Frequency Tracking: LMS and RLS Applied to Speech Formant Estimation ) Introduction Several speech processing algorithms assume the signal

More information

Chapter 4: Nutrition. Teacher s Guide

Chapter 4: Nutrition. Teacher s Guide Chapter 4: Nutrition Teacher s Guide 55 Chapter 4: Nutrition Teacher s Guide Learning Objectives Students will explain two ways that nutrition affects health Students will describe the function of 5 iportant

More information