A Language Modeling Approach for Temporal Information Needs


A Language Modeling Approach for Temporal Information Needs
Klaus Berberich, Srikanta Bedathur, Omar Alonso, Gerhard Weikum
ECIR 2010, Milton Keynes, March 29th, 2010

Motivation
- Information needs having a temporal dimension, e.g.:
  - FIFA World Cup tournaments of the 1990s
  - Movies that won an Academy Award in 2007
  - Crusades of the 12th century
- Queries containing a temporal expression (e.g., "in 1998")
  - indicate an underlying temporal information need
  - make up 1.5% of general web queries (Nunes et al. 08)
  - are more common for specific domains (e.g., News or Sports) and/or specific user groups (e.g., journalists or historians)
- But: not well supported by existing retrieval models!

Outline
- Motivation
- Temporal Expressions
- Language Models for Temporal Information Needs
- Experimental Evaluation
- Conclusion

Temporal Expressions
- Temporal expressions are frequent across many classes of documents and can be categorized as:
  - explicit (e.g., "March 29th, 2010" or "September 1872")
  - implicit (e.g., "Christmas 2009" or "New Year's Eve 2000")
  - relative (e.g., "yesterday", "last month", or "in January")
- Off-the-shelf tools are available to identify and interpret temporal expressions (e.g., TARSQI and TimexTag)
- TimeML mark-up language specification for annotating temporal expressions found in a document

Challenges
- Existing retrieval models ignore temporal expressions and their meaning and therefore fail to match, e.g.:
  - Query: fifa world cup 1990s
  - Document: During the '90s the FIFA World Cup was won by Germany (in 1990), Brazil (in 1994), and France (in 1998)
- The meaning of a temporal expression is often uncertain, e.g.:
  - France won the FIFA World Cup in 1998
  - In 1998 Bill Clinton was President of the U.S.
  - Nagano hosted the Winter Olympics in 1998

Temporal Expression Model
- We formally model temporal expressions as 4-tuples $T = (tb_l, tb_u, te_l, te_u)$ that record the earliest/latest begin/end of the time intervals that $T$ may refer to
- The temporal expression $T$ may thus refer to any time interval $[b, e]$ such that $b \le e \,\wedge\, tb_l \le b \le tb_u \,\wedge\, te_l \le e \le te_u$
- "In 1998", e.g., is represented as (1998/01/01, 1998/12/31, 1998/01/01, 1998/12/31)
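
The 4-tuple model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TemporalExpression`, the method `may_refer_to`, and the choice of day granularity are not from the slides.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemporalExpression:
    """4-tuple (tb_l, tb_u, te_l, te_u); day granularity is an assumption."""
    tb_l: date  # earliest begin
    tb_u: date  # latest begin
    te_l: date  # earliest end
    te_u: date  # latest end

    def may_refer_to(self, b: date, e: date) -> bool:
        """True if b <= e, tb_l <= b <= tb_u and te_l <= e <= te_u."""
        return (b <= e
                and self.tb_l <= b <= self.tb_u
                and self.te_l <= e <= self.te_u)


# "In 1998" as given on the slide.
in_1998 = TemporalExpression(date(1998, 1, 1), date(1998, 12, 31),
                             date(1998, 1, 1), date(1998, 12, 31))

# An interval inside 1998 is admissible; one starting in 1997 is not.
print(in_1998.may_refer_to(date(1998, 6, 10), date(1998, 7, 12)))
print(in_1998.may_refer_to(date(1997, 12, 31), date(1998, 1, 1)))
```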


Document & Query Model
- We distinguish between the textual part and the temporal part of a document $d$:
  - $d_{text}$ is a bag of textual terms
  - $d_{time}$ is a bag of temporal expressions
- We distinguish between the textual part and the temporal part of a query $q$:
  - $q_{text}$ is a bag of textual terms
  - $q_{time}$ is a bag of temporal expressions
- Two modes of deriving a query from the user's input:
  - inclusive mode: $q_{text}$ includes all input terms
  - exclusive mode: $q_{text}$ excludes input terms that are part of a temporal expression

Language Model Framework
- Query-likelihood approach assuming independent generation of the textual and temporal query parts:
  $P(q \mid d) = P(q_{text} \mid d_{text}) \times P(q_{time} \mid d_{time})$
- $P(q_{text} \mid d_{text})$ is estimated using a unigram language model with Jelinek-Mercer smoothing as
  $\prod_{q \in q_{text}} \bigl( \lambda \cdot P(q \mid d_{text}) + (1 - \lambda) \cdot P(q \mid C) \bigr)$
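
A minimal sketch of the textual part, assuming maximum-likelihood estimates for both the document model and the collection model $C$. The function name `jm_query_likelihood`, the default `lam=0.75`, and the toy data are illustrative assumptions; the slides do not state a value for $\lambda$.

```python
from collections import Counter


def jm_query_likelihood(query_terms, doc_terms, collection_terms, lam=0.75):
    """P(q_text | d_text) with Jelinek-Mercer smoothing.

    Follows the form on the slide: lam weights the document model and
    (1 - lam) weights the collection model. lam=0.75 is an assumed value.
    """
    doc_tf, coll_tf = Counter(doc_terms), Counter(collection_terms)
    doc_len, coll_len = len(doc_terms), len(collection_terms)
    likelihood = 1.0
    for term in query_terms:
        p_doc = doc_tf[term] / doc_len if doc_len else 0.0
        p_coll = coll_tf[term] / coll_len if coll_len else 0.0
        likelihood *= lam * p_doc + (1.0 - lam) * p_coll
    return likelihood


# Toy data (assumed): one document and a tiny collection that contains it.
doc = ["fifa", "world", "cup", "france"]
collection = doc + ["winter", "olympics", "nagano", "fifa"]
print(jm_query_likelihood(["fifa", "cup"], doc, collection, lam=0.5))
```

In practice the product is usually computed as a sum of logarithms to avoid underflow on long queries; the plain product is kept here to mirror the formula.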

Language Model Framework
- We assume that query temporal expressions are generated independently from each other:
  $P(q_{time} \mid d_{time}) = \prod_{Q \in q_{time}} P(Q \mid d_{time})$
- Two-step generation of a query temporal expression $Q$:
  - (I) Draw a temporal expression $T$ uniformly at random from $d_{time}$
  - (II) Generate $Q$ from $T$
  $P(Q \mid d_{time}) = \frac{1}{|d_{time}|} \sum_{T \in d_{time}} P(Q \mid T)$
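
The two-step generation can be sketched with the generator $P(Q \mid T)$ passed in as a function, so that different definitions of it can be plugged in. The function name `temporal_query_likelihood` and the handling of documents without temporal expressions are assumptions; the slides do not say how that case is treated.

```python
def temporal_query_likelihood(q_time, d_time, p_q_given_t):
    """P(q_time | d_time) = prod over Q of (1/|d_time|) * sum over T of P(Q | T).

    p_q_given_t is any callable (Q, T) -> probability.
    """
    if not d_time:
        # Assumption: a document without temporal expressions cannot
        # generate a temporal query part; an empty query part scores 1.
        return 0.0 if q_time else 1.0
    likelihood = 1.0
    for Q in q_time:
        # Step (I): T is drawn uniformly, hence the 1/|d_time| weight.
        # Step (II): Q is generated from T with probability P(Q | T).
        likelihood *= sum(p_q_given_t(Q, T) for T in d_time) / len(d_time)
    return likelihood


# Stand-in generator for illustration only: strings act as expressions.
same = lambda Q, T: 1.0 if Q == T else 0.0
print(temporal_query_likelihood(["1990s"], ["1990s", "1990", "1994", "1998"], same))
```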

Uncertainty-ignorant Approach
- The temporal expression $T$ only generates itself:
  $P(Q \mid T) = \begin{cases} 1 & : Q = T \\ 0 & : \text{otherwise} \end{cases}$
- Ignores temporal expressions' inherent uncertainty
- Profits from the extraction of temporal expressions, e.g.:
  - Query: fifa world cup 1990s
  - Document: During the '90s the FIFA World Cup was won by Germany (in 1990), Brazil (in 1994), and France (in 1998)
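
Applied to the FIFA example, exact matching can be sketched as follows. The helpers `year` and `decade` are hypothetical, and encoding "the 1990s" as (1990/01/01, 1999/12/31, 1990/01/01, 1999/12/31) is an assumption made by analogy with the "In 1998" example.

```python
from datetime import date


def year(y):
    """Hypothetical helper: 4-tuple for "in <y>", mirroring "In 1998"."""
    lo, hi = date(y, 1, 1), date(y, 12, 31)
    return (lo, hi, lo, hi)


def decade(first_year):
    """Hypothetical helper: 4-tuple for a decade (assumed encoding)."""
    lo, hi = date(first_year, 1, 1), date(first_year + 9, 12, 31)
    return (lo, hi, lo, hi)


def p_exact(Q, T):
    """Uncertainty-ignorant generation: T only generates itself."""
    return 1.0 if Q == T else 0.0


def p_q_given_doc(Q, d_time):
    """Average of P(Q | T) over the document's temporal expressions."""
    return sum(p_exact(Q, T) for T in d_time) / len(d_time)


query = decade(1990)                                      # "1990s"
doc = [decade(1990), year(1990), year(1994), year(1998)]  # the FIFA sentence
print(p_q_given_doc(query, doc))      # only "the '90s" matches the query
print(p_q_given_doc(query, doc[1:]))  # without "the '90s" nothing matches
```

The second call illustrates the limitation: a document mentioning only 1990, 1994, and 1998 gets no credit for a query about the 1990s, although all three years lie inside that decade.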

Uncertainty-aware Approach
- We assume that any time interval $[q_b, q_e]$ that the query temporal expression $Q$ may refer to is equally likely:
  $P(Q \mid T) = \frac{1}{|Q|} \sum_{[q_b, q_e] \in Q} P([q_b, q_e] \mid T)$
- The temporal expression $T$ generates any time interval $[q_b, q_e]$ that it may refer to with equal probability:
  $P([q_b, q_e] \mid T) = \begin{cases} 1/|T| & : [q_b, q_e] \in T \\ 0 & : \text{otherwise} \end{cases}$

Uncertainty-aware Approach
- Intuitively, $P(Q \mid T)$ reflects the probability that the user issuing the query and the author writing the document had the same time interval in mind
- The definition can be simplified as
  $P(Q \mid T) = \frac{|T \cap Q|}{|T| \cdot |Q|}$
  treating $Q$ and $T$ as sets of time intervals
- $P(Q \mid T)$ can be computed efficiently without materializing the huge but finite sets of time intervals
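
One way to evaluate the simplified formula without enumerating intervals is to count them in closed form. The sketch below is an assumption about how this could be done, not necessarily the authors' implementation: it works at day granularity on integer day numbers (`date.toordinal()`), and the names `count_intervals`, `intersect`, `p_q_given_t`, and `span` are illustrative.

```python
from datetime import date


def count_intervals(tb_l, tb_u, te_l, te_u):
    """Number of intervals [b, e] with tb_l <= b <= tb_u,
    te_l <= e <= te_u and b <= e (all bounds are integer day numbers)."""
    if tb_l > tb_u or te_l > te_u:
        return 0
    # Begin points b <= te_l are compatible with every end point.
    n_early = max(0, min(tb_u, te_l) - tb_l + 1)
    total = n_early * (te_u - te_l + 1)
    # A begin point b > te_l leaves te_u - b + 1 end points; summing over
    # these begin points gives an arithmetic series.
    lo, hi = max(tb_l, te_l + 1), min(tb_u, te_u)
    if lo <= hi:
        total += (hi - lo + 1) * ((te_u - lo + 1) + (te_u - hi + 1)) // 2
    return total


def intersect(Q, T):
    """4-tuple describing the intervals that both Q and T may refer to."""
    return (max(Q[0], T[0]), min(Q[1], T[1]), max(Q[2], T[2]), min(Q[3], T[3]))


def p_q_given_t(Q, T):
    """P(Q | T) = |T ∩ Q| / (|T| * |Q|)."""
    size_q, size_t = count_intervals(*Q), count_intervals(*T)
    if size_q == 0 or size_t == 0:
        return 0.0
    return count_intervals(*intersect(Q, T)) / (size_t * size_q)


def span(first, last):
    """Expression whose begin and end may each fall anywhere in [first, last]."""
    lo, hi = first.toordinal(), last.toordinal()
    return (lo, hi, lo, hi)


the_1990s = span(date(1990, 1, 1), date(1999, 12, 31))
year_1998 = span(date(1998, 1, 1), date(1998, 12, 31))
year_2004 = span(date(2004, 1, 1), date(2004, 12, 31))

print(p_q_given_t(the_1990s, year_1998))  # small but non-zero: 1998 lies in the 1990s
print(p_q_given_t(the_1990s, year_2004))  # zero: the expressions are disjoint
```

Unlike exact matching, a document mentioning "in 1998" now receives a non-zero score for a query about the 1990s, while coarser or non-overlapping expressions are penalized.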

Uncertainty-aware Approach
- Considers temporal expressions' inherent uncertainty
- Profits from the extraction of temporal expressions, e.g.:
  - Query: fifa world cup 1990s
  - Document: During the '90s the FIFA World Cup was won by Germany (in 1990), Brazil (in 1994), and France (in 1998)


Datasets & Methods
- Document collections:
  - New York Times Annotated Corpus
  - English Wikipedia (as of July '09)
- Temporal expressions annotated using TARSQI
- 40 queries collected using Amazon Mechanical Turk:
  - Five temporal granularities (Day, Month, Year, Decade, Century)
  - Four topics (Sports, Culture, Technology, World Affairs)
- Methods under comparison:
  - LM: Unigram LM
  - LMT-IN / LMT-EX: Uncertainty-ignorant LM
  - LMTU-IN / LMTU-EX: Uncertainty-aware LM


Datasets & Methods: example queries
- boston red sox [october 27, 2004]
- kurt cobain [april 5, 1994]
- pink floyd [march 1973]
- babe ruth [1921]
- michael jordan [1990s]
- mickey mouse [1930s]
- soccer [21st century]
- jazz music [21st century]
- voyager [september 5, 1977]
- berlin [october 27, 1961]
- poland [december 1970]
- wright brothers [1905]
- sewing machine [1850s]
- siemens [19th century]


Relevance Assessments & Measures
- Relevance assessments on pooled query results collected using Amazon Mechanical Turk:
  - binary relevance assessments with an "I don't know" option
  - mandatory justification of relevance assessment
  - three assessors per query-document pair
- Fleiss' kappa statistic indicates a fair level of agreement:
  - New York Times Annotated Corpus (0.36)
  - English Wikipedia (0.40)
- We measure retrieval effectiveness using:
  - Precision @ 10 (P@10)
  - Normalized Discounted Cumulative Gain @ 10 (nDCG@10)
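
Both measures can be sketched for binary relevance labels as follows. The slides do not state which nDCG variant was used; the gain of 1 per relevant result, the $\log_2(\text{rank} + 1)$ discount, and the optional `n_relevant` parameter are common choices assumed here.

```python
import math


def precision_at_k(rels, k=10):
    """Fraction of the top-k results that are relevant (rels holds 0/1 labels)."""
    return sum(rels[:k]) / k


def dcg_at_k(rels, k=10):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))


def ndcg_at_k(rels, k=10, n_relevant=None):
    """DCG normalized by the DCG of an ideal ranking.

    n_relevant is the number of known relevant documents for the query;
    if omitted, the relevant documents in rels are used (an assumption).
    """
    if n_relevant is None:
        n_relevant = sum(rels)
    ideal = dcg_at_k([1] * min(n_relevant, k), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0


ranking = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]  # toy labels for one query's top 10
print(precision_at_k(ranking), ndcg_at_k(ranking))
```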


[Figure: Retrieval Effectiveness Overall, New York Times. P@10 and nDCG@10 (axis 0 to 0.6) for LM, LMT-IN, LMT-EX, LMTU-IN, and LMTU-EX. Value labels: 0.37, 0.25, 0.32, 0.39, 0.51, 0.38, 0.28, 0.33, 0.4, 0.49.]

[Figure: Retrieval Effectiveness Overall, Wikipedia. P@10 and nDCG@10 (axis 0 to 0.6) for LM, LMT-IN, LMT-EX, LMTU-IN, and LMTU-EX. Value labels: 0.51, 0.33, 0.56, 0.48, 0.48, 0.36, 0.34, 0.36, 0.46, 0.51.]

[Figure: Retrieval Effectiveness by Topic, New York Times. P@10 and nDCG@10 (axis 0 to 0.6) per topic (Sports, Culture, Technology, World Affairs) for LM, LMT-IN, LMT-EX, LMTU-IN, and LMTU-EX.]

[Figure: Retrieval Effectiveness by Topic, Wikipedia. P@10 and nDCG@10 (axis 0 to 0.6) per topic (Sports, Culture, Technology, World Affairs) for LM, LMT-IN, LMT-EX, LMTU-IN, and LMTU-EX.]

[Figure: Retrieval Effectiveness by Granularity, New York Times. P@10 and nDCG@10 (axis 0 to 0.8) per granularity (Day, Month, Year, Decade, Century) for LM, LMT-IN, LMT-EX, LMTU-IN, and LMTU-EX.]

[Figure: Retrieval Effectiveness by Granularity, Wikipedia. P@10 and nDCG@10 (axis 0 to 0.8) per granularity (Day, Month, Year, Decade, Century) for LM, LMT-IN, LMT-EX, LMTU-IN, and LMTU-EX.]


Conclusion
- Our approach integrates temporal expressions seamlessly into a language modeling framework
- Experiments show that temporal expressions help to better satisfy temporal information needs if their inherent uncertainty is taken into account
- More (details on) experiments and links to download the extracted temporal expressions and relevance assessments are available in our technical report!

Thanks! Questions? http://www.mpi-inf.mpg.de/~kberberi/ecir2010/