Papers

  • BERT for Coreference Resolution: Baselines and Analysis

    Mandar Joshi, Omer Levy, Daniel S. Weld, Luke Zettlemoyer • EMNLP • 2019. We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks. A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at…
  • BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle

    Peter West, Ari Holtzman, Jan Buys, Yejin Choi • EMNLP • 2019. The principle of the Information Bottleneck (Tishby et al., 1999) is to produce a summary of information X optimized to predict some other relevant information Y. In this paper, we propose a novel approach to unsupervised sentence summarization by mapping the…
  • COSMOS QA: Machine Reading Comprehension with Contextual Commonsense Reasoning

    Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi • EMNLP • 2019. Understanding narratives requires reading between the lines, which, in turn, requires interpreting the likely causes and effects of events, even when they are not mentioned explicitly. In this paper, we introduce Cosmos QA, a large-scale dataset of 35,600…
  • Counterfactual Story Reasoning and Generation

    Lianhui Qin, Antoine Bosselut, Ari Holtzman, Chandra Bhagavatula, Elizabeth Clark, Yejin Choi • EMNLP • 2019. Counterfactual reasoning requires predicting how alternative events, contrary to what actually happened, might have resulted in different outcomes. Despite being considered a necessary component of AI-complete systems, few resources have been developed for…
  • Do NLP Models Know Numbers? Probing Numeracy in Embeddings

    Eric Wallace, Yizhong Wang, Sujian Li, Sameer Singh, Matt Gardner • EMNLP • 2019. The ability to understand and work with numbers (numeracy) is critical for many complex reasoning tasks. Currently, most NLP models treat numbers in text the same way as other tokens: they embed them as distributed vectors. Is this enough to capture…
  • Don't paraphrase, detect! Rapid and Effective Data Collection for Semantic Parsing

    Jonathan Herzig, Jonathan Berant • EMNLP • 2019. A major hurdle on the road to conversational interfaces is the difficulty of collecting data that maps language utterances to logical forms. One prominent approach for data collection has been to automatically generate pseudo-language paired with logical…
  • Efficient Navigation with Language Pre-training and Stochastic Sampling

    Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah Smith, Yejin Choi • EMNLP • 2019. Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and environments. In this paper, we report two simple but highly…
  • Entity, Relation, and Event Extraction with Contextualized Span Representations

    David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi • EMNLP • 2019. We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction. Our framework (called DyGIE++) accomplishes all tasks by enumerating, refining, and…
  • Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text

    Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark • EMNLP • 2019. Our goal is to better comprehend procedural text, e.g., a paragraph about photosynthesis, by not only predicting what happens, but why some actions need to happen before others. Our approach builds on a prior process comprehension framework for predicting…
  • Global Reasoning over Database Structures for Text-to-SQL Parsing

    Ben Bogin, Matt Gardner, Jonathan Berant • EMNLP • 2019. State-of-the-art semantic parsers rely on auto-regressive decoding, emitting one symbol at a time. When tested against complex databases that are unobserved at training time (zero-shot), the parser often struggles to select the correct set of database…