Extracting Events with Informal Temporal References in Personal Histories in Online Communities

Abstract

We present a distantly supervised system for extracting the dates of illness events from posting histories in the context of an online medical support community. We annotate a corpus containing the complete posting histories of 300 users, with a total of 509 event dates. A temporal tagger is designed to retrieve candidate dates from non-standard language in social media. Our evaluation demonstrates that the temporal tagger outperforms a state-of-the-art baseline tagger. The event extraction system learns the likelihood of candidate dates for events from time-rich sentences and temporal constraints from event-related sentences. Our model achieves 89.7% of the maximum possible performance given the performance of the temporal expression retrieval step.

1 Introduction

In this paper we present a challenging new event date extraction task that requires extraction and resolution of temporal expression forms that are beyond the capabilities of the best state-of-the-art approach. Specifically, it targets extraction of personal illness event mentions in the posting histories of online community participants. In this context, we present a novel event date extraction approach that achieves high accuracy on this difficult new task. At its heart, the technique builds on a temporal expression (TE) retrieval approach that outperforms the most effective previously published approach (Chang and Manning, 2012) in the context of this novel task. Building on this foundation, the event date extraction technique achieves 89.7% of the maximum possible performance given the performance of the TE retrieval step.
Social media enables measurement of human behavior in a longitudinal manner, and thus is an informative tool in the study of how people experience and respond to important events (Brubaker et al., 2012; Choudhury et al., 2013). Extracted user-specific event dates can form a timeline that offers the opportunity to examine an individual's behavior as it changes over time in response to these events. This work has typically involved associating users with events and event temporal information manually (Park and Choi, 2012). Because of the tremendous time involved in such analysis, these investigations have typically adopted small case study methodologies (Wen et al., 2012). Recent work has explored alternative methodologies such as Amazon Mechanical Turk (Choudhury et al., 2013), which may suffer from low recall.

Despite the value of user-specific temporally scoped event information and the considerable research that has been done in temporal information extraction, the existing widely used resources are designed for learning temporally scoped facts about public figures and organizations from newswire or Wikipedia articles (Ji et al., 2011; McClosky and Manning, 2012; Garrido et al., 2012). Applying these approaches in a social media context poses a number of significant challenges that we address in this paper.

We begin with the production of an annotated corpus of cancer event (CE) dates from user posts in Breastcancer.org. Next we describe a temporal expression extractor, which is a critical component in most of the existing temporal information extraction work. Our temporal extractor resolves non-standard and user-specific date references in social media.

While prior work has leveraged predictable ordering constraints between events, illness events may not follow a predictable sequence. For example, in our corpus we find that cancer events such as chemotherapy, radiation, and surgery are recurring and can appear in an almost random order.
Thus, it is less straightforward to utilize logical constraints on temporal relations between different types of events, as has been valuable in previous work (McClosky and Manning, 2012). In our work, we leverage a new source of constraint. In particular, we demonstrate how to learn temporal constraints from the temporal relation that exists between the document creation time (DCT; in this paper, the time of a post) and the event mentioned in the post.

In what follows, we first review related work in Section 2. We then describe the task and the corpus annotation in Sections 3 and 4 respectively. Next, we demonstrate how to tackle the noisiness of social media language and reliably extract the event dates in Section 5. We also learn time constraints to choose the event date from multiple sources, utilizing both the sentences containing date expressions and sentences in context that may be relevant to the event. The results are presented in Section 6.

2 Related Work

Previous work on TE extraction has focused mainly on newswire, Wikipedia and blogs (Strötgen and Gertz, 2010; Chang and Manning, 2012). It is not surprising that there would be difficulty in applying the same tagger to a different genre such as forum posts (Strötgen and Gertz, 2012; Kolomiyets et al., 2011). This paper presents a rule-based TE extractor that identifies and resolves a wider range of non-standard TEs than earlier state-of-the-art temporal taggers. Here we review that prior work.

2.1 Temporally Anchored Relation Extraction

Our task is closest to the temporal slot filling track in the TAC-KBP 2011 shared task (Ji et al., 2011) and the timelining task described earlier (McClosky and Manning, 2012). The goal is to extract the temporal bounds of event-based relations. One key difference is that they used newswire, Wikipedia and blogs as data sources from which they extract temporal bounds of facts found in Wikipedia infoboxes or the Freebase database. Similar to McClosky et al. (2012), our system uses the classifier's marginal probabilities along with a constraint component. We also demonstrate that this method performs better than utilizing hard constraints as used in several Temporal KBP slot filling systems (Artiles et al., 2011; Garrido et al., 2011).

2.2 Learning temporal constraints

Temporal constraints have proven to be useful for production of a globally consistent timeline.
In most temporal relation bound extraction systems, the constraints are included as input rather than learned by the system (Talukdar et al., 2012; Wang et al., 2011). A notable exception is McClosky et al. (2012), who developed an approach to learning constraints such as that people cannot attend school if they have not been born yet. Constraints are somewhat less straightforward in our task, however. Since Cancer Events (CEs) can be recurring and appear in an almost random order, there can be no universal logical constraints on the order of cancer events, such as that Chemotherapy should happen before Mastectomy. Instead, we utilize the temporal relations expressed between events and DCT to infer constraints on the corresponding event dates. Garrido et al. (2012) make use of DCT as well; however, they assume that DCT is within the time range of the event stated in the document. We do not make this assumption, since it often does not hold in our data. Instead, we learn the relations as a pairwise classification task. Similar to existing methods that extract the temporal relations between event and DCT (Chambers et al., 2007; Yoshikawa et al., 2009; UzZaman and Allen, 2010), we also design new features to remove noise.

2.3 Temporal reasoning with medical data

There has been much recent interest in building timelines of medical events from clinical narratives. Zhou and Hripcsak (2007) provide a comprehensive survey of temporal reasoning with clinical data. Extracting precise temporal features for highly accurate temporal ordering of medical events is difficult, as the temporal relationship between medical events is varied and complicated. Raghavan et al. (2012) demonstrate that the resources widely used for learning temporal relations in newswire text do not work well on clinical text.

3 Task

As the foundation for our novel task definition, we first describe the temporal KBP task (Ji et al., 2011).
In that task, one is first given a list of queries, a database of example relations with temporal bounds, and source documents. Queries are named entities with their gold relations but no temporal bounds. The database consists of training entities and their relations with temporal bounds. For example, an entity might be Bill Clinton, with the relation "is US president" and the temporal bound [1993, 2001]. Source documents consist of raw text from news, blogs and Wikipedia articles. For each query, systems are required to output their predicted temporal bounds.

Our task is similar to the Temporal KBP task, although it is more challenging. It is similar in the sense that we are identifying temporal bounds for events mentioned in a source document collection. In our case, the query is the user's userid paired with any one of our CEs. Source documents come from each user's posts and profile. A key difference is that in the KBP task, the set of relevant gold relations is provided as input, and each of those relations should correspond to exactly one event in the source document set, although it may be mentioned more than once. In our task, we provide a set of potential events. However, for each event, there may be zero or more occurrences within each user's history (e.g., a patient may have had chemotherapy more than once), and each event that has occurred may have been mentioned any number of times. Also, the event date may not be identifiable from the source documents, i.e. the user's posts, even if the event was mentioned. If there were zero occurrences of the event, or the event date cannot be inferred from the source documents, the system should output unknown. Otherwise, the desired output is the extracted event date for the user.

<11/15/2008> I only have a mast on my left side, but I have noticed some pulling recently and I won't start rads until March [Underspecified].
<11/20/2008> It is sloowwwly healing, so slowly, in fact, that she said she HOPES it will be healed by March [Underspecified] when I am supposed to start rads.
<1/13/2009> I still have one last chemo to go on the 19th [Underspecified] and then start rads in 5 wks [Relative].
<1/31/2009> I go for my first meeting with the rad onc on 2/10 [Underspecified] (my 50th birthday [User-specific]!).
<2/23/2009> I had my first rad today [Relative].
<3/31/2009> Tomorrow [Relative] will be my last full rads, then I will start the boosts.
<4/2/2009> I started rads in Feb [Underspecified]. I just did #29 today [Relative].
<4/8/2009> The rad onc wants to see me again next week [Relative] for a skin check and I am on antibiotics since I am at risk of getting cellulitis since I have had it twice since August [Underspecified].
<6/21/2010> My friend Lisa had her port put in last week [Relative] and will begin 2 weeks [Relative] of radiation on Tuesday [Underspecified].

Figure 1: Excerpts from messages of a user that contain a Radiation event keyword. Cancer event keywords are in bold and date expressions are in italics.
Posts from a Breastcancer.org user may contain scattered mentions of CEs, including conditions, along with some information on when they occurred. Much of the temporal information in the messages is informal, imprecise or embedded in complex relative TEs. Some sample message excerpts are shown in Figure 1. This user begins to post about radiation before she starts it. She first reports planning to start radiation in March, but then reschedules for February. Many Relative and Underspecified TEs are used around the CE keyword instead of Absolute TEs. She also shares her friend's CE story. These all pose challenges for our task. In Sections 3.1 and 3.2, we explain how we define CEs and their temporal representation. In Section 3.3, we illustrate how the TEs in our corpus differ from those in newswire text.

3.1 Cancer event representation

The concept of CE is different from the events defined in the widely used Timebank corpus for temporal information learning (Pustejovsky et al., 2003). In Timebank, the notion of an event primarily consists of verbs or phrases that denote a change in state. In contrast, in prior work on extraction of Medical Events (MEs) from clinical texts, as well as for CEs in Breastcancer.org, the target events are more often expressed as nouns than as verbs. For example, previous work on temporal reasoning in clinical texts has defined MEs as temporally associated concepts that describe a medical condition affecting the patient's health, or procedures performed on a patient, such as fever and vancomycin pain control (Zhou and Hripcsak, 2007). CEs are a subset of MEs. While there is prior work on reasoning about MEs that builds on a representation of extracted events (Raghavan et al., 2012), the MEs in this previous work are manually annotated as input to subsequent processes, as opposed to our work, where we extract CEs automatically.
As our ultimate goal is to study users' behavior changes around major events, we focus specifically on eight common breast cancer events that are known to be stressful experiences. The CEs fall into the three groups shown below.

- Status change event: breast cancer Diagnosis, Metastasis and Recurrence. These are state-changing events associated with the moment when the news is declared to the patient.
- One-day event: Mastectomy, Lumpectomy, Reconstruction. These three events are all breast cancer-related surgeries, so they typically start and finish on the same day.
- Event with a temporal bound: Chemotherapy and Radiation. Both of these events are common cancer treatments which last from several weeks to several months.

For each CE, we manually design an event keyword set. The keyword set includes the name of the event, abbreviations, slang, aliases and other related words. For example, the Reconstruction keyword set contains the common surgery type names, such as DIEP (deep inferior epigastric perforator flap breast reconstruction). By creating an analogous keyword set, our event date extraction method could be easily adapted to other datasets.

3.2 Temporal representation of cancer events

In Temporal KBP, the temporal representation allows for upper and lower bounds on both the event start and end: $\langle s_l, s_u, e_l, e_u \rangle$ where $s_l \le \text{start} \le s_u$ and $e_l \le \text{end} \le e_u$. However, it is difficult to obtain these bounds from the posts, even for humans. Furthermore, in our case, there are no analogous resources like Freebase or Wikipedia infoboxes from which to obtain additional data, as used in earlier KBP work. In contrast, McClosky et al. (2012) opted for a simpler representation of the form <start, end>. We adopted a similarly simple representation that is in the same spirit as the time duration based representation of MEs used earlier (Zhou and Hripcsak, 2007).

The start and end dates are the same for a one-day event. Thus we only extract a single date in that case. For a status change event, the end date is infinitely far into the future, or at least until death, so we only extract the start date (Zhou and Hripcsak (2007) call this a semi-interval problem). For events with a temporal bound, we transform them into one-day events by treating the start and the end of the event as two separate events. For example, the start and end of chemotherapy are regarded as two separate events. Then for each event we extract a single date.

3.3 Unusual date expressions in our corpus

The distribution of TE types found in social media differs from that in newswire text. Table 1 presents a classification of TEs, which we classify according to the way we resolve the TE to a calendar date (year and month). An Absolute TE does not need resolution. For an Underspecified TE, we need to resolve the underspecified part. Relative TEs need to be resolved with a reference date. A User-specific TE can be resolved with user-specific information, such as birth date. Our TE retrieval system extracted 195,143 TEs from our entire corpus. According to Pustejovsky et al. (2003), in newswire and Wikipedia articles the frequency of DATE TEs (which correspond to Absolute and Underspecified in Table 1) is 68.5%, and thus the simplest types of TE to resolve constitute the bulk of temporal references.
However, they only account for 48% in our corpus. Relative or user-specific dates, which are more challenging to resolve, are slightly more prevalent than this set in our data (e.g., "last month" rather than "on July 3rd, 2007"). In particular, user-specific TEs such as the user's age or cancer anniversary account for 5% of the data. It is especially common that the user specifies her age at diagnosis. Many of the TEs are written in a casual way, even Absolute TEs. For example, Feb 11, Feb/11, 02/2011 and (in) 2/11 all represent the same date, February 2011. When using Relative TEs, year, week and month can be abbreviated as yr, wk and mo(n). To the best of our knowledge, these TEs are not recognized by any available temporal tagger.

Type of TE      Example             Percentage
Absolute        in July             24
Underspecified  in July             24
Relative        two months ago      47
User-specific   at the age of 57    5

Table 1: Types of temporal expression (TE) in our corpus.

4 Corpus Annotation

Our research site, Breastcancer.org, is one of the most popular breast cancer discussion boards online. We have collected all the public posts, users, and their profiles on the discussion board platform from October 2001 to January During this period there were a total of 90,242 unique users who posted 1,562,459 messages. From this set of posts we extracted and then annotated two separate corpora, one for temporal expression retrieval and the other for event date extraction (corpus link will be provided in full paper).

The temporal expression corpus consists of 1,000 posts. As each individual user tends to use a narrow range of date expression forms, we randomly draw 1,000 users and randomly pick one post from each user in order to cover a wider variety of temporal expressions. We manually extracted 601 date expressions from these posts and resolved them to a specific month and year, or just the year if the month is not mentioned.

Our corpus for event date extraction consists of 300 users that are randomly drawn from our dataset.
In total, 509 events are annotated with dates. Each user posted a mean of 54 messages in the forum. The granularity of resolved temporal references is the month. The annotators were provided with guidelines for how to infer the date of the event (coding manual link will be provided in full paper). The annotators were instructed to read every post of the user and extract the event dates from the most precise report. For example, when extracting a user's diagnosis date, suppose the user has only one post about her diagnosis, which says:

<6/10/2010> I was diagnosed last year.

Then the diagnosis date will be 2009. But suppose the user has another post which contains more precise information about her diagnosis date:

<5/20/2009> I was diagnosed two weeks ago.

Then the diagnosis date should be 5/2009.

To measure the annotation agreement, three annotators annotated the complete posting histories of 10 users (23 event dates were extracted in total). We evaluated the reliability of the hand extraction by first listing, for each annotator, the complete set of events extracted or not extracted for each user. We then computed Cohen's Kappa between these lists of binary indicators for each pair of annotators and averaged across pairs. The average inter-annotator agreement for this binary indicator was 0.94 Kappa. Then, for all events that each pair of annotators agreed were present for a user, we computed the amount of time between the extracted and resolved date and the post time of the user's first post. Similar to prior work evaluating agreement on temporal extraction, we then computed Cronbach's alpha between these continuous scales, with one assignment per event, across annotators. Again, the agreement was quite high, with an alpha of 0.99. Thus, we concluded that the temporal dependency annotation paradigm was reliable.

5 Method

An overview of our system is shown in Figure 2. Given an event and a user's post set, the system searches for all of the sentences that contain an event keyword (keyword sentences) and all the sentences that contain both a keyword and a temporal expression (date sentences). The TEs in the date sentences are resolved and then used as candidate dates for the event. This process is sometimes known as distant supervision (Mintz et al., 2009).

When users are chatting on the forums, they frequently use misspelled or mistyped words, such as Feburuary and Febuary. Medical terms like metastasis and mastectomy are even more likely to be misspelled. Thus, as a pre-processing step, we use the spell checker in LanguageTool to correct spelling in each sentence.

Our model contains two main components. The classifier component learns from the date sentences and determines how likely each candidate TE is to be the correct event date. The constraint component learns temporal constraints over event dates. This is done by learning to predict the temporal relation between an event and a keyword sentence's DCT. The two components work together to find a globally optimal event date.

5.1 Non-standard temporal expression retrieval

We recognize four types of time expressions as illustrated in Table 1.
We can see that it is crucial to resolve the underspecified and relative TEs correctly since they occur so frequently.

5.1.1 Temporal expression identification

There are a number of open source temporal expression retrieval systems available, such as HeidelTime (Strötgen and Gertz, 2010) and SUTime (Chang and Manning, 2012). However, most of the systems are designed for newswire texts and cannot handle informal language in online forums well. We design a rule-based date expression tagger that is built using regular expression patterns. In Section 6.1, we compare its performance with SUTime, which has outperformed other state-of-the-art systems on the TempEval-2 corpus (Chang and Manning, 2012).

Figure 2: System overview (components: Posts, Date Sent., Keyword Sent., Classifier Component, Constraint Component, Aggregation, Event date).

SUTime identifies and resolves a wide range of non-standard TE types such as Feb 07 (2/2007). The additional types of non-standard TE our approach handles are listed as follows.

- User-specific TEs: A user's age, cancer anniversary and survivorship can provide temporal information about the user's CEs. We obtain the birth date of users from their personal profile to resolve age date expressions such as "at the age of 57". Anniversary and survivorship expressions usually indicate the number of years from diagnosis to DCT. For example, "Today is my third breast cancer anniversary" is equivalent to "I'm three years from my diagnosis".
- Non-whole numbers, such as a year and half, two and a half years, 1/2 years, 1.5 months.
- Abbreviations of time units: Informally, people use wk as the abbreviation of week, yr for year, and mo(n) for month.
- Other non-standard TEs, including Feb/97 (2/1997), 03/2005 (3/2005), in 6/08 (6/2008), on 6/13 (June of the DCT year), in 06 (2006), September 08 (9/2008) and mid-June.

5.1.2 Temporal expression normalization

For underspecified or relative dates, DCT is used to resolve these dates to fully specified expressions if possible.
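To make the rule-based idea concrete, the following is a minimal sketch of how a few of the "other non-standard" forms listed in Section 5.1.1 might be matched with regular expressions and resolved against the DCT. It is not the tagger evaluated in this paper: the pattern set, the names `PATTERNS`, `expand_year` and `tag`, and the pivot rule for two-digit years are all assumptions made for illustration.

```python
import re
from datetime import date

MONTHS = {m: i + 1 for i, m in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"])}

NAME = r"(?P<name>jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*"
UNITS = r"(?!\s*(?:days?|wks?|weeks?|mon?s?|months?|yrs?|years?)\b)"

# Hypothetical patterns, tried in order; earlier patterns claim their span first.
PATTERNS = [
    # Feb/97, Feb 11, September 08, Feb 2011
    re.compile(rf"\b{NAME}\.?[ /](?P<year>\d{{4}}|\d{{2}})\b", re.I),
    # 03/2005
    re.compile(r"(?<![/\d])(?P<month>0?[1-9]|1[0-2])/(?P<year>\d{4})\b"),
    # in 6/08
    re.compile(r"\bin (?P<month>0?[1-9]|1[0-2])/(?P<year>\d{2})(?![/\d])", re.I),
    # on 6/13 (month/day; the year is taken from the DCT)
    re.compile(r"\bon (?P<month>0?[1-9]|1[0-2])/(?P<day>\d{1,2})(?![/\d])", re.I),
    # in 06 (but not "in 10 days")
    re.compile(rf"\bin (?P<year>\d{{2}})(?![/\d]){UNITS}", re.I),
]


def expand_year(yy, dct):
    """Expand a two-digit year, pivoting on the DCT year (an assumption)."""
    year = 2000 + yy
    return year if year <= dct.year + 1 else year - 100


def tag(text, dct):
    """Return (matched text, year, month or None) for each recognised TE."""
    found, taken = [], []
    for pattern in PATTERNS:
        for m in pattern.finditer(text):
            if any(m.start() < end and start < m.end() for start, end in taken):
                continue  # span already claimed by an earlier pattern
            g = m.groupdict()
            if g.get("name"):
                month = MONTHS[g["name"].lower()]
            elif g.get("month"):
                month = int(g["month"])
            else:
                month = None
            if g.get("year") is None:
                year = dct.year  # e.g. "on 6/13": June of the DCT year
            elif len(g["year"]) == 4:
                year = int(g["year"])
            else:
                year = expand_year(int(g["year"]), dct)
            taken.append(m.span())
            found.append((m.group(0), year, month))
    return found
```

For instance, `tag("I was diagnosed in 6/08", date(2010, 6, 10))` yields a single match resolved to June 2008. A sketch like this deliberately reads "Feb 11" as February 2011, following the convention described in Section 3.3; a fuller tagger would need to disambiguate such forms from day-of-month mentions.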
As underspecified or relative dates are more prevalent than absolute times in our data (e.g., "last month" rather than "on July 3rd, 2007"), our temporal expression retrieval method supports the following two types of resolution.

Fuzzy date: We have noticed that different temporal expression types are mentioned with different accuracies. Because users are often not precise when talking about an event date, an underspecified TE activates a range of dates instead of a single date. When we resolve a date expression, we also define a time window according to the date expression type that captures the relevant range of dates. For example,

(1) <6/10/2010> I was diagnosed two years ago.
(2) <6/10/2010> I was diagnosed last summer.

In sentence (1), her diagnosis date is actually 7/2008, not 6/10/2008, so in this case only the year information is accurate. The time window is [3/2008, 9/2008]. In sentence (2), the true date should be one of June, July or August of 2009. The time window is [6/2009, 8/2009].

For underspecified month mentions, we resolve the year information according to the DCT month, the mentioned month and the verb tense. If there is no verb in the sentence, the default tense is past tense. For example,

<1/16/2011> I was diagnosed in May.

As the verb "was" is in past tense, we can infer that the user refers to May of 2010, since May of 2011 is a date in the future. However, when the mentioned month precedes the DCT month, the user is usually referring to a month in the same year as the DCT. For example, the resolved date is 5/2011 for the sentence:

<7/16/2011> I was diagnosed in May.

5.2 Classifier component

In the classifier component, we predict how likely a retrieved TE is to be the event date. We create training examples by computing the metarelation between each TE and its gold event date. We allow for underspecified matches. Thus, the metarelation is true if the gold date falls into the window of the TE (see Section 5.1.2). Otherwise, the TE is assigned false. For example, suppose the resolved date is 9/15/2001 and the time window is [7/15/2001, 11/15/2001]. If the gold date is 8/2001, we assign the true metarelation. But if the gold date is 1/2001, 9/15/2001 is assigned false. We train a MaxEnt classifier on each TE in a date sentence.
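The year-resolution rule for bare month mentions and the window-based labelling of training examples can be sketched as follows. This is an illustrative reading of the description above at month granularity, not the system's code: the function names, the handling of a month equal to the DCT month, the rule for future tense, and the window half-width of two months used in the example are all assumptions.

```python
from datetime import date


def month_index(year, month):
    """Map (year, month) to a single integer so months can be compared."""
    return year * 12 + (month - 1)


def resolve_month_mention(month, dct, tense="past"):
    """Choose the year for a bare month mention such as 'in May'.

    Past tense (the default when no verb is found): a month later than the
    DCT month must belong to the previous year. The future-tense branch is
    an assumption that mirrors the past-tense rule.
    """
    if tense == "past":
        year = dct.year if month <= dct.month else dct.year - 1
    else:
        year = dct.year if month >= dct.month else dct.year + 1
    return year, month


def window(year, month, half_width):
    """Time window of +/- half_width months around a resolved date."""
    center = month_index(year, month)
    return center - half_width, center + half_width


def metarelation(gold, win):
    """True if the gold (year, month) falls inside the TE's window."""
    low, high = win
    return low <= month_index(*gold) <= high


# Example from Section 5.2: resolved date 9/2001 with window [7/2001, 11/2001].
win = window(2001, 9, half_width=2)
print(metarelation((2001, 8), win))  # gold date 8/2001 falls inside the window
print(metarelation((2001, 1), win))  # gold date 1/2001 falls outside
```

With these definitions, `resolve_month_mention(5, date(2011, 1, 16))` gives May 2010 and `resolve_month_mention(5, date(2011, 7, 16))` gives May 2011, matching the two "I was diagnosed in May" examples.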
Features for the classifier include many of those in (McClosky and Manning, 2012; Yoshikawa et al., 2009) (Table 2). We use the Stanford Parser to generate the syntactic features. The new features are the Event-Subject, Negative and Modality features. In online support groups, users not only tell stories about themselves, they also share other patients' stories (as shown in Figure 1). So we add subject features to remove this kind of noise, which include the governing subject of the event keyword and its POS tag. Modality features include the appearance of modals before the event keyword (e.g., may, might, etc.). Negative features include the presence or absence of negative words (e.g., no, not, never, etc.). These two features indicate a hypothetical or counterfactual expression of the event. These features are not useful in newswire texts, where there is little variation in either modality or polarity. As Chambers et al. (2007) have pointed out, the majority class baseline of modality and polarity in that context is 98%.

Let $DS_u$ be the set of the user's date sentences, and let $D_u$ be the set of dates resolved from each TE. We represent a MaxEnt classifier by $P_{relation}(R \mid t, ds)$ for a candidate date $t$ in date sentence $ds$ and possible relation $R \in \{\text{true}, \text{false}\}$. We map this distribution over relations to a distribution over dates by defining $P_{DateSentence}(t \mid DS_u)$:

$$P_{DateSentence}(t \mid DS_u) = \frac{1}{Z(D_u)} \sum_{t_j \in D_u} \delta_{t_j}(t)\, P_{relation}(\text{true} \mid t_j, ds_j)$$

$$\delta_{t_j}(t) = \begin{cases} 1 & \text{if } t \in t_j\text{'s window} \\ 0 & \text{otherwise} \end{cases}$$

We refer to this model as the Date Classifier model, as it only utilizes temporal information from date sentences.

5.3 Constraint component

Previous work only retrieves time-rich sentences. In other words, the sentence should include the query, slot fillers, as well as some TEs (Ji et al., 2011; McClosky and Manning, 2012; Garrido et al., 2012). However, keyword sentences can inform temporal constraints for events and therefore should not be ignored.
For example, "Well, I'm officially a rads grad!" indicates that the user has finished radiation by the time of the post. "Radiation is not a choice for me." indicates that the user probably never had radiation. Before chemotherapy starts, users tend to talk about choices of drug combinations. After chemotherapy, they tend to talk about side effects such as increased weight or hair loss. This section departs from the above Date Classifier and instead predicts whether each keyword sentence is posted before or after the user's event date. The goal is to automatically learn time constraints for the event. We create training examples by computing the temporal relation between the DCT and the user's gold event date. If the user is not identified with an event date, the sentences are labeled as unknown. We train a MaxEnt classifier on each pair of event mention and its corresponding DCT. Features are listed in Table 2. All the features used in the classifier component that are not related to the TEs are included.
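The construction of training labels for the constraint component can be sketched as below. The sketch assumes month granularity and represents a gold event date as a `(year, month)` tuple; the function names and the example gold date (a radiation start in 2/2009, read off the Figure 1 excerpts) are illustrative assumptions rather than part of the annotated corpus format.

```python
from datetime import date


def rel(dct, event):
    """Relation of a post's DCT to an event date given as (year, month).

    'before' if the post precedes the event month, otherwise 'after'.
    """
    return "before" if (dct.year, dct.month) < event else "after"


def label_keyword_sentences(posts, gold_event):
    """Label each keyword sentence for training.

    posts: list of (dct, sentence) pairs for one user and one event type.
    gold_event: (year, month) of the gold event date, or None when the
    user has no identified date for this event.
    """
    labels = []
    for dct, sentence in posts:
        label = "unknown" if gold_event is None else rel(dct, gold_event)
        labels.append((sentence, label))
    return labels


posts = [
    (date(2008, 11, 15), "I won't start rads until March."),
    (date(2009, 4, 2), "I started rads in Feb."),
]
for sentence, label in label_keyword_sentences(posts, (2009, 2)):
    print(label, "|", sentence)
```

Note that the labels depend only on the post time and the gold date, not on any date expression in the sentence, which is why keyword sentences without TEs can still contribute training examples.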

Let $KS_u$ be the set of the user's keyword sentences, and let $D_u$ be the set of dates resolved from each date sentence. We represent a MaxEnt classifier by $P_{relation}(R \mid ks)$ for a keyword sentence $ks$ and possible relation $R \in \{\text{before}, \text{after}, \text{unknown}\}$. DCT is the post time of the keyword sentence $ks$. The $rel(dct, t)$ function simply determines whether the DCT is before or after the candidate date $t$. We map this distribution over relations to a distribution over dates by defining $P_{KeywordSentence}(t \mid KS_u)$:

$$P_{KeywordSentence}(t \mid KS_u) = \frac{1}{Z(D_u)} \prod_{ks_j \in KS_u} P_{relation}(rel(dct_j, t) \mid ks_j)$$

$$rel(dct, t) = \begin{cases} \text{before} & \text{if } dct < t \\ \text{after} & \text{if } dct \ge t \end{cases}$$

Feature                          Classifier   Constraint
                                 Component    Component
Event-keyword                    X            X
Event-gov-verb                   X            X
  (tense)                        X            X
Event-Subject                    X            X
  (POS)                          X            X
Date-type                        X
Date-gov-verb                    X
  (tense)                        X
Date-gov-prep                    X
Unigram (POS)                    X            X
Bigram (POS)                     X            X
Dependency path (Date-Keyword)   X
  Path length                    X
  POS-path                       X
  Word-path                      X
Negative                         X            X
Modality                         X            X

Table 2: Features. Event-gov-verb denotes the verb that dominates the event keyword. Date-gov-prep denotes the preposition that dominates the temporal expression.

5.4 Date Aggregation

Given the classifier component of Section 5.2 and the constraint classifiers of Section 5.3, we create a joint model combining the two with the following linear interpolation:

$$P(t \mid posts_u) = \lambda P_{DateSentence}(t \mid DS_u) + (1 - \lambda) P_{KeywordSentence}(t \mid KS_u)$$

where $t$ is a candidate event date. The system outputs the $t$ that maximizes $P(t \mid posts_u)$, and unknown if $DS_u$ is empty.

6 Evaluation metric and results

6.1 Date expression retrieval

We first compare our temporal tagger's performance with SUTime (Chang and Manning, 2012) on the 601 manually extracted TEs. Here we exclude the User-specific TEs. The evaluation consists of two parts: identifying the extent of a TE and then providing the correct resolved date for each recognized expression. Table 3 gives the experimental results. Our tagger has higher precision and recall in both tasks.
The improvement in recall is especially high because we handle the non-standard TEs described in Section 3.3.

Table 3: Date expression retrieval results (columns: P, R, F1; rows: SUTime and our tagger, for Extents and for Normalization).

6.2 Event dates

Our corpus consists of 250 users for training and 50 for test. Each user has approximately 1.7 extracted event dates on average. λ was set to 0.7 by maximizing accuracy in five-fold cross-validation on the training set.

6.2.1 Evaluation metric

The extracted date is only considered correct if it completely matches the gold date. In several cases, we have multiple dates for the same event (e.g., a user had a mastectomy twice). Similar to the evaluation metric in a previous study (Ji et al., 2011), in these cases we give the system the benefit of the doubt, and the extracted date is considered correct if it matches one of the gold dates. In previous work (McClosky and Manning, 2012; Ji et al., 2011), the evaluation metric score is defined as $1/(1 + |d|)$, where $d$ is the difference between the values in years. We choose a much stricter evaluation metric because we need a precise event date to study user behavior changes.

6.2.2 Baselines and oracle

We provide two baselines to describe heuristic methods of aggregating the hard decisions from the classifier learned in Section 5.2. The first baseline, Baseline1, is to pick the date with the highest classifier prediction confidence. The second baseline, Baseline2, is along the same lines as the Combined Classifier used in (McClosky and Manning, 2012). For example, if the candidate date is 6/2009 and we have retrieved two temporal expressions that are resolved to 6/2009 and 4/2008, then $P(6/2009) = P_{relation}(\text{true} \mid 6/2009) \times P_{relation}(\text{false} \mid 4/2008)$. To determine the best possible performance given our temporal expression retrieval system, we calculate the oracle score by considering an extraction correct if the gold date is one of the retrieved candidate dates.
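The difference between the two baselines and the models of Section 5 is easiest to see side by side. The sketch below is a simplified illustration, not the evaluated system: candidates are `(date, p_true)` pairs with dates as `(year, month)` tuples, windows are reduced to exact month equality, the normaliser Z is taken to be the sum over candidate dates, the keyword-sentence scores are assumed to be combined by a product, and all probabilities in the usage example are made up.

```python
from math import prod


def baseline1(cands):
    """Pick the candidate date with the single highest P(true)."""
    return max(cands, key=lambda c: c[1])[0]


def baseline2(cands):
    """Combine all marginals: P(true) for mentions of t, P(false) for the rest."""
    dates = {d for d, _ in cands}
    return max(dates, key=lambda t: prod(p if d == t else 1.0 - p for d, p in cands))


def p_date_sentence(cands):
    """Date Classifier: sum P(true) over the mentions that support each date."""
    totals = {}
    for d, p in cands:
        totals[d] = totals.get(d, 0.0) + p
    z = sum(totals.values()) or 1.0
    return {d: s / z for d, s in totals.items()}


def p_keyword_sentence(dates, keyword_preds):
    """Constraint score per candidate date.

    keyword_preds: list of (dct, probs), where dct is (year, month) and
    probs maps 'before'/'after' to the constraint classifier's probability.
    """
    raw = {t: prod(probs["before" if dct < t else "after"]
                   for dct, probs in keyword_preds) for t in dates}
    z = sum(raw.values()) or 1.0
    return {t: s / z for t, s in raw.items()}


def joint(cands, keyword_preds, lam=0.7):
    """Linear interpolation of the two distributions (lambda = 0.7 as in 6.2)."""
    pd = p_date_sentence(cands)
    pk = p_keyword_sentence(pd.keys(), keyword_preds)
    return max(pd, key=lambda t: lam * pd[t] + (1.0 - lam) * pk[t])


# Two mentions resolve to 6/2009 and one, slightly more confident, to 4/2008.
cands = [((2009, 6), 0.6), ((2008, 4), 0.7), ((2009, 6), 0.5)]
keyword_preds = [((2009, 1), {"before": 0.9, "after": 0.1})]
print(baseline1(cands), baseline2(cands), joint(cands, keyword_preds))
```

In this toy example both baselines select 4/2008, whereas summing over the two mentions of 6/2009 lets the Date Classifier and the joint model select 6/2009, which illustrates how redundancy among date mentions can outweigh a single confident prediction.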
The oracle score can differ from a perfect score since we can only use candidate temporal expressions if (a) the relation is known, (b) mentions of the event are retrievable (i.e., contain an event keyword), (c) the temporal expression appears in the same sentence, and (d) our temporal tagger is able to recognize and resolve it correctly. Nevertheless, it is a reasonable upper bound in our setting.

Table 4: Event-level five-fold cross-validation performance of models and baselines on training data. Columns: CE count; P, R, F1 for Baseline1, Baseline2, Date and Joint; F1 for Oracle. Rows: Diagnosis, Metastasis, Recurrence, Chemo-start, Chemo-end, Rad-start, Rad-end, Mastectomy, Lumpectomy, Reconstruction.

Results

We present the performance of our models, baselines and the oracle in Table 5. We evaluate on all the events (ALL) and only on the events with known dates (GOLD). When the gold relations are provided, precision and recall are equal, so we report only the F1 measure. The joint model performs best among the non-oracle systems. Both the Date Classifier and the Joint Model outperform the basic aggregation models. All improvements are statistically significant (p < , McNemar's test, 2-tailed). This shows the value of our approach of leveraging the redundancy of event date mentions instead of incorporating all marginal probabilities. Incorporating time constraints further improves the F1. The joint model achieves 89.7% of the oracle result.

Table 5: Performance of systems on the test set. Columns: P, R, F1 on ALL; F1 on GOLD. Rows: Baseline1, Baseline2, Date Classifier, Joint Model, Oracle.

Table 4 shows the performance of our systems and baselines on individual event types. The Joint Model derives most of its improvement from performance on the Chemotherapy and Radiation start dates. This is mainly because Chemotherapy and Radiation last for a period of time, so there are more event-related discussions containing the event keyword. None of our systems improves on cancer Metastasis and Recurrence. This is likely due to the sparsity of these events.
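To make the joint model evaluated in this section more concrete, the following is a minimal Python sketch of the keyword-sentence distribution of Section 5.3, the linear interpolation of Section 5.4, and the exact-match metric. It is an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary representation of classifier outputs, and the toy probabilities are all hypothetical, and the combination over keyword sentences is assumed to be a normalized product. Only λ = 0.7 is taken from the text.

```python
from datetime import date
from math import prod

LAMBDA = 0.7  # interpolation weight reported in Section 6.2


def rel(dct, t):
    """Relation of the post time (DCT) to candidate date t (Section 5.3)."""
    return "before" if dct < t else "after"


def keyword_sentence_dist(candidates, keyword_sentences):
    """Normalized product of relation probabilities over keyword sentences.

    keyword_sentences is a list of (dct, probs) pairs; probs maps
    "before"/"after" to a probability from a hypothetical relation classifier.
    """
    scores = {
        t: prod(probs[rel(dct, t)] for dct, probs in keyword_sentences)
        for t in candidates
    }
    z = sum(scores.values())  # plays the role of Z(D_u)
    return {t: s / z for t, s in scores.items()} if z > 0 else scores


def joint_date(date_sentence_dist, keyword_sentences, lam=LAMBDA):
    """Return the candidate maximizing the interpolated score, or "unknown"."""
    if not date_sentence_dist:  # DS_u is empty
        return "unknown"
    kw = keyword_sentence_dist(date_sentence_dist.keys(), keyword_sentences)
    joint = {t: lam * p + (1 - lam) * kw[t] for t, p in date_sentence_dist.items()}
    return max(joint, key=joint.get)


def is_correct(predicted, gold_dates):
    """Exact-match metric: correct if the prediction equals any gold date."""
    return predicted in gold_dates


# Toy example: two equally likely candidates from date sentences, and one
# keyword sentence posted on 2008-09-15 that suggests the event is in the past.
ds = {date(2009, 6, 1): 0.5, date(2008, 4, 1): 0.5}
ks = [(date(2008, 9, 15), {"before": 0.2, "after": 0.8})]
print(joint_date(ds, ks))  # prints 2008-04-01
```

In this toy case the date-sentence evidence is tied, so the temporal constraint from the keyword sentence decides the output, which is the intended role of the constraint component.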
7 Error Analysis and Discussion

Distant supervision assumption: In our distant supervision framework, we assume that a DE must appear in the same sentence as a cancer keyword in order to become a candidate date. This is often not true. For example, users can use a metaphor to refer to the event: My roller coaster ride began October 9th, From the context, we can infer that this user was diagnosed on October 9th,

Deep temporal reasoning: Under-specified and relative TEs often require deep temporal reasoning in order to be resolved. For example: <5/20/2010> Dx with 7.5 cm tumor same side, stage IIb in Sept 08, mastectomy in Oct, and started AC in Nov, and Taxol in Jan 09. The TE "in Oct" should be resolved as 10/2008. But if we resolve it based only on the DCT, the resolved date is 10/

Conclusion

In this paper we presented a challenging new event date extraction task that requires extraction and resolution of non-standard date expression forms, namely extraction of personal illness event dates in the posting histories of online community participants. We have constructed an evaluation corpus and designed a temporal tagger for non-standard date expressions in social media. We introduced a joint framework to learn and apply temporal constraints in the date aggregation process. Both of our models (Date and Joint) obtain substantial error reductions over simpler consistency-based heuristic approaches. By creating an analogous keyword set, the overall framework can easily be applied to other information extraction tasks. A key aspect of future work will be to extend our model to other social media such as Facebook posts. Another line of study is to use the extracted timelines for user behavior analysis where qualitative analysis at a large scale is difficult or impossible.
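As an illustration of the deep temporal reasoning issue raised in Section 7, the sketch below resolves under-specified month mentions against the most recently resolved date in the same sentence rather than against the DCT alone. This is a hypothetical heuristic written for this example, not the tagger described in the paper: the function name, the (month, year) input format, the assumption that the narrative moves forward in time, and the DCT fallback rule are all assumptions.

```python
def resolve_months(mentions, dct_year, dct_month):
    """Resolve month mentions to (year, month) pairs in sentence order.

    mentions is a list of (month, year_or_None) tuples. A mention with an
    explicit year is kept as is; one without a year is resolved relative to
    the previous resolved mention (the anchor), assuming the narrative moves
    forward in time. With no anchor, it falls back to the most recent such
    month that is not after the post time (DCT).
    """
    resolved = []
    anchor = None  # most recently resolved (year, month)
    for month, year in mentions:
        if year is None:
            if anchor is None:
                # Hypothetical DCT fallback: latest matching month up to DCT.
                year = dct_year if month <= dct_month else dct_year - 1
            else:
                a_year, a_month = anchor
                # Same year if the month does not precede the anchor month,
                # otherwise roll over into the following year.
                year = a_year if month >= a_month else a_year + 1
        anchor = (year, month)
        resolved.append(anchor)
    return resolved


# The example sentence from Section 7, posted on 5/20/2010:
# "... in Sept 08, mastectomy in Oct, and started AC in Nov, and Taxol in Jan 09."
mentions = [(9, 2008), (10, None), (11, None), (1, 2009)]
print(resolve_months(mentions, dct_year=2010, dct_month=5))
# prints [(2008, 9), (2008, 10), (2008, 11), (2009, 1)]
```

With the anchor taken from "Sept 08", the mention "in Oct" resolves to 10/2008, matching the intended reading described in the error analysis.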

References

Javier Artiles, Qi Li, Taylor Cassidy, Suzanne Tamang, and Heng Ji. CUNY BLENDER TACKBP2011 Temporal Slot Filling System Description. In Proceedings of Text Analysis Conference (TAC).

J. R. Brubaker, F. Kivran-Swaine, L. Taber, and G. R. Hayes. Grief-Stricken in a Crowd: The language of bereavement and distress in social media. In Proc. ICWSM.

Nathanael Chambers, Shan Wang, and Dan Jurafsky. Classifying temporal relations between events. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics.

Angel X. Chang and Christopher D. Manning. SUTIME: A library for recognizing and normalizing time expressions. In 8th International Conference on Language Resources and Evaluation (LREC 2012).

M. De Choudhury, S. Counts, and E. Horvitz. Major Life Changes and Behavioral Markers in Social Media: Case of Childbirth. In Proc. CSCW.

K. P. Davison, J. W. Pennebaker, and S. S. Dickerson. Who talks? The social psychology of illness support groups. American Psychologist, 55(2):

Guillermo Garrido, Anselmo Penas, Bernardo Cabaleiro, and Alvaro Rodrigo. Temporally Anchored Relation Extraction. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics.

Heng Ji, Ralph Grishman, and Hoa Trang Dang. Overview of the TAC 2011 Knowledge Base Population track. In Proceedings of Text Analysis Conference (TAC).

Hyuckchul Jung, James Allen, Nate Blaylock, Will de Beaumont, Lucian Galescu, and Mary Swift. Building timelines from narrative clinical records: initial results based on deep natural language understanding. In Proceedings of BioNLP.

Oleksandr Kolomiyets, Steven Bethard, and Marie-Francine Moens. Model-Portability Experiments for Textual Temporal Analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics.

Oleksandr Kolomiyets, Steven Bethard, and Marie-Francine Moens. Extracting narrative timelines as temporal dependency structures. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics.

David McClosky and Christopher D. Manning. Learning Constraints for Consistent Timeline Extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP2012).

Mike Mintz, Steven Bills, Rion Snow, and Dan Jurafsky. Distant supervision for relation extraction without labeled data. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics.

Heekyong Park and Jinwook Choi. V-model: a new innovative model to chronologically visualize narrative clinical texts. In Proceedings of the 2012 ACM Annual Conference on Human Factors in Computing Systems. ACM.

Catherine Plaisant, Brett Milash, Anne Rose, Seth Widoff, and Ben Shneiderman. LifeLines: visualizing personal histories. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.

James Pustejovsky, José M. Castaño, Robert Ingria, Roser Sauri, Robert J. Gaizauskas, Andrea Setzer, Graham Katz, and Dragomir R. Radev. TimeML: Robust specification of event and temporal expressions in text. In New Directions in Question Answering 03.

James Pustejovsky, Patrick Hanks, Roser Sauri, Andrew See, Robert Gaizauskas, Andrea Setzer, Dragomir Radev, et al. The Timebank corpus. In Corpus Linguistics.

Preethi Raghavan, Eric Fosler-Lussier, and Albert M. Lai. Learning to Temporally Order Medical Events in Clinical Text. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics.

Jannik Strötgen and Michael Gertz. HeidelTime: High Quality Rule-Based Extraction and Normalization of Temporal Expressions. In SemEval 10.

Jannik Strötgen and Michael Gertz. Temporal Tagging on Different Domains: Challenges, Strategies, and Gold Standards. In LREC2012.

Partha Pratim Talukdar, Derry Wijaya, and Tom Mitchell. Coupled temporal scoping of relational facts. In Proceedings of the Fifth ACM International Conference on Web Search and Data Mining. ACM.

Naushad UzZaman and James F. Allen. TRIPS and TRIOS system for TempEval-2: Extracting temporal information from text. In Proceedings of the 5th International Workshop on Semantic Evaluation.

Yafang Wang, Bing Yang, Lizhen Qu, Marc Spaniol, and Gerhard Weikum. Harvesting facts from textual web sources by constrained label propagation. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management.

K.-Y. Wen, F. McTavish, G. Kreps, M. Wise, and D. Gustafson. From diagnosis to death: A case study of coping with breast cancer as seen through online discussion group messages. Journal of Computer-Mediated Communication, 16:

Katsumasa Yoshikawa, Sebastian Riedel, Masayuki Asahara, and Yuji Matsumoto. Jointly identifying temporal relations with Markov logic. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP.

Li Zhou and George Hripcsak. Temporal reasoning with medical data: a review with emphasis on medical natural language processing. Journal of Biomedical Informatics 40.2 (2007): 183.


More information

English and Persian Apposition Markers in Written Discourse: A Case of Iranian EFL learners

English and Persian Apposition Markers in Written Discourse: A Case of Iranian EFL learners 7 English and Persian Apposition Markers in Written Discourse: A Case of Iranian EFL learners Samaneh Chamanaraeian M.A. Student in Islamic Azad University (Isfahan Branch) samanechaman@yahoo.com and (corresponding

More information

May All Your Wishes Come True: A Study of Wishes and How to Recognize Them

May All Your Wishes Come True: A Study of Wishes and How to Recognize Them May All Your Wishes Come True: A Study of Wishes and How to Recognize Them Andrew B. Goldberg, Nathanael Fillmore, David Andrzejewski, Zhiting Xu, Bryan Gibson & Xiaojin Zhu Computer Sciences Department

More information

Knowledge networks of biological and medical data An exhaustive and flexible solution to model life sciences domains

Knowledge networks of biological and medical data An exhaustive and flexible solution to model life sciences domains Knowledge networks of biological and medical data An exhaustive and flexible solution to model life sciences domains Dr. Sascha Losko, Dr. Karsten Wenger, Dr. Wenzel Kalus, Dr. Andrea Ramge, Dr. Jens Wiehler,

More information

Temporal Analysis of Online Social Graph by Home Location

Temporal Analysis of Online Social Graph by Home Location Temporal Analysis of Online Social Graph by Home Location Shiori Hironaka Mitsuo Yoshida Kyoji Umemura Toyohashi University of Technology Aichi, Japan s143369@edu.tut.ac.jp, yoshida@cs.tut.ac.jp, umemura@tut.jp

More information

Combining unsupervised and supervised methods for PP attachment disambiguation

Combining unsupervised and supervised methods for PP attachment disambiguation Combining unsupervised and supervised methods for PP attachment disambiguation Martin Volk University of Zurich Schönberggasse 9 CH-8001 Zurich vlk@zhwin.ch Abstract Statistical methods for PP attachment

More information

A Study of Abbreviations in Clinical Notes Hua Xu MS, MA 1, Peter D. Stetson, MD, MA 1, 2, Carol Friedman Ph.D. 1

A Study of Abbreviations in Clinical Notes Hua Xu MS, MA 1, Peter D. Stetson, MD, MA 1, 2, Carol Friedman Ph.D. 1 A Study of Abbreviations in Clinical Notes Hua Xu MS, MA 1, Peter D. Stetson, MD, MA 1, 2, Carol Friedman Ph.D. 1 1 Department of Biomedical Informatics, Columbia University, New York, NY, USA 2 Department

More information

Not all NLP is Created Equal:

Not all NLP is Created Equal: Not all NLP is Created Equal: CAC Technology Underpinnings that Drive Accuracy, Experience and Overall Revenue Performance Page 1 Performance Perspectives Health care financial leaders and health information

More information

Discovering Identity Problems: A Case Study

Discovering Identity Problems: A Case Study Discovering Identity Problems: A Case Study G. Alan Wang 1, Homa Atabakhsh 1, Tim Petersen 2, Hsinchun Chen 1 1 Department of Management Information Systems, University of Arizona, Tucson, AZ 85721 {gang,

More information

Technical Specifications

Technical Specifications Technical Specifications In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically

More information

Some Thoughts on the Principle of Revealed Preference 1

Some Thoughts on the Principle of Revealed Preference 1 Some Thoughts on the Principle of Revealed Preference 1 Ariel Rubinstein School of Economics, Tel Aviv University and Department of Economics, New York University and Yuval Salant Graduate School of Business,

More information

Practitioner s Guide To Stratified Random Sampling: Part 1

Practitioner s Guide To Stratified Random Sampling: Part 1 Practitioner s Guide To Stratified Random Sampling: Part 1 By Brian Kriegler November 30, 2018, 3:53 PM EST This is the first of two articles on stratified random sampling. In the first article, I discuss

More information

Deep Learning based Information Extraction Framework on Chinese Electronic Health Records

Deep Learning based Information Extraction Framework on Chinese Electronic Health Records Deep Learning based Information Extraction Framework on Chinese Electronic Health Records Bing Tian Yong Zhang Kaixin Liu Chunxiao Xing RIIT, Beijing National Research Center for Information Science and

More information

Reliability, validity, and all that jazz

Reliability, validity, and all that jazz Reliability, validity, and all that jazz Dylan Wiliam King s College London Introduction No measuring instrument is perfect. The most obvious problems relate to reliability. If we use a thermometer to

More information

A Computational Theory of Belief Introspection

A Computational Theory of Belief Introspection A Computational Theory of Belief Introspection Kurt Konolige Artificial Intelligence Center SRI International Menlo Park, California 94025 Abstract Introspection is a general term covering the ability

More information

DTI C TheUniversityof Georgia

DTI C TheUniversityof Georgia DTI C TheUniversityof Georgia i E L. EGCTE Franklin College of Arts and Sciences OEC 3 1 1992 1 Department of Psychology U December 22, 1992 - A Susan E. Chipman Office of Naval Research 800 North Quincy

More information

Cross-cultural Deception Detection

Cross-cultural Deception Detection Cross-cultural Deception Detection Verónica Pérez-Rosas Computer Science and Engineering University of North Texas veronicaperezrosas@my.unt.edu Rada Mihalcea Computer Science and Engineering University

More information

Lecture 10: POS Tagging Review. LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han

Lecture 10: POS Tagging Review. LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han Lecture 10: POS Tagging Review LING 1330/2330: Introduction to Computational Linguistics Na-Rae Han Overview Part-of-speech tagging Language and Computers, Ch. 3.4 Tokenization, POS tagging NLTK Book Ch.5

More information

Using Statistical Intervals to Assess System Performance Best Practice

Using Statistical Intervals to Assess System Performance Best Practice Using Statistical Intervals to Assess System Performance Best Practice Authored by: Francisco Ortiz, PhD STAT COE Lenny Truett, PhD STAT COE 17 April 2015 The goal of the STAT T&E COE is to assist in developing

More information

Semantic Alignment between ICD-11 and SNOMED-CT. By Marcie Wright RHIA, CHDA, CCS

Semantic Alignment between ICD-11 and SNOMED-CT. By Marcie Wright RHIA, CHDA, CCS Semantic Alignment between ICD-11 and SNOMED-CT By Marcie Wright RHIA, CHDA, CCS World Health Organization (WHO) owns and publishes the International Classification of Diseases (ICD) WHO was entrusted

More information

Multi-modal Patient Cohort Identification from EEG Report and Signal Data

Multi-modal Patient Cohort Identification from EEG Report and Signal Data Multi-modal Patient Cohort Identification from EEG Report and Signal Data Travis R. Goodwin and Sanda M. Harabagiu The University of Texas at Dallas Human Language Technology Research Institute http://www.hlt.utdallas.edu

More information

Disaster Analysis using User-Generated Weather Report

Disaster Analysis using User-Generated Weather Report Disaster Analysis using User-Generated Weather Report Yasunobu Asakura Tokyo Metropolitan University asakura-yasunobu@ed.tmu.ac.jp Masatsugu Hangyo Weathernews Inc. hangyo@wni.com Mamoru Komachi Tokyo

More information

Exploiting Task-Oriented Resources to Learn Word Embeddings for Clinical Abbreviation Expansion

Exploiting Task-Oriented Resources to Learn Word Embeddings for Clinical Abbreviation Expansion Exploiting Task-Oriented Resources to Learn Word Embeddings for Clinical Abbreviation Expansion Yue Liu 1, Tao Ge 2, Kusum Mathews 3, Heng Ji 1, Deborah L. McGuinness 1 1 Department of Computer Science,

More information

Measuring Focused Attention Using Fixation Inner-Density

Measuring Focused Attention Using Fixation Inner-Density Measuring Focused Attention Using Fixation Inner-Density Wen Liu, Mina Shojaeizadeh, Soussan Djamasbi, Andrew C. Trapp User Experience & Decision Making Research Laboratory, Worcester Polytechnic Institute

More information

Analysis A step in the research process that involves describing and then making inferences based on a set of data.

Analysis A step in the research process that involves describing and then making inferences based on a set of data. 1 Appendix 1:. Definitions of important terms. Additionality The difference between the value of an outcome after the implementation of a policy, and its value in a counterfactual scenario in which the

More information

DEEP LEARNING BASED VISION-TO-LANGUAGE APPLICATIONS: CAPTIONING OF PHOTO STREAMS, VIDEOS, AND ONLINE POSTS

DEEP LEARNING BASED VISION-TO-LANGUAGE APPLICATIONS: CAPTIONING OF PHOTO STREAMS, VIDEOS, AND ONLINE POSTS SEOUL Oct.7, 2016 DEEP LEARNING BASED VISION-TO-LANGUAGE APPLICATIONS: CAPTIONING OF PHOTO STREAMS, VIDEOS, AND ONLINE POSTS Gunhee Kim Computer Science and Engineering Seoul National University October

More information

Measuring impact. William Parienté UC Louvain J PAL Europe. povertyactionlab.org

Measuring impact. William Parienté UC Louvain J PAL Europe. povertyactionlab.org Measuring impact William Parienté UC Louvain J PAL Europe povertyactionlab.org Course overview 1. What is evaluation? 2. Measuring impact 3. Why randomize? 4. How to randomize 5. Sampling and Sample Size

More information

Certificate Courses in Biostatistics

Certificate Courses in Biostatistics Certificate Courses in Biostatistics Term I : September December 2015 Term II : Term III : January March 2016 April June 2016 Course Code Module Unit Term BIOS5001 Introduction to Biostatistics 3 I BIOS5005

More information

Incorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.2 8/10/2011 I. Purpose

Incorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.2 8/10/2011 I. Purpose Incorporation of Imaging-Based Functional Assessment Procedures into the DICOM Standard Draft version 0.2 8/10/2011 I. Purpose Drawing from the profile development of the QIBA-fMRI Technical Committee,

More information

Spatial Prepositions in Context: The Semantics of near in the Presence of Distractor Objects

Spatial Prepositions in Context: The Semantics of near in the Presence of Distractor Objects Spatial Prepositions in Context: The Semantics of near in the Presence of Distractor Objects Fintan J. Costello, School of Computer Science and Informatics, University College Dublin, Dublin, Ireland.

More information

An Annotated Corpus of Typical Durations of Events

An Annotated Corpus of Typical Durations of Events An Annotated Corpus of Typical Durations of Events Feng Pan, Rutu Mulkar, and Jerry R. Hobbs Information Sciences Institute (ISI), University of Southern California 4676 Admiralty Way, Marina del Rey,

More information

Checklist of Key Considerations for Development of Program Logic Models [author name removed for anonymity during review] April 2018

Checklist of Key Considerations for Development of Program Logic Models [author name removed for anonymity during review] April 2018 Checklist of Key Considerations for Development of Program Logic Models [author name removed for anonymity during review] April 2018 A logic model is a graphic representation of a program that depicts

More information

Insights into Analogy Completion from the Biomedical Domain

Insights into Analogy Completion from the Biomedical Domain Insights into Analogy Completion from the Biomedical Domain Denis Newman-Griffis, Albert Lai, Eric Fosler-Lussier The Ohio State University National Institutes of Health, Clinical Center Washington University

More information