Papers

  • Benchmarking Hierarchical Script Knowledge

    Yonatan Bisk, Jan Buys, Karl Pichotta, Yejin Choi. NAACL 2019. Understanding procedural language requires reasoning about both hierarchical and temporal relations between events. For example, “boiling pasta” is a sub-event of “making a pasta dish”, typically happens before “draining pasta,” and requires the use of…
  • Combining Distant and Direct Supervision for Neural Relation Extraction

    Iz Beltagy, Kyle Lo, Waleed Ammar. NAACL 2019. In relation extraction with distant supervision, noisy labels make it difficult to train quality models. Previous neural models addressed this problem using an attention mechanism that attends to sentences that are likely to express the relations. We improve…
  • CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

    Alon Talmor, Jonathan Herzig, Nicholas Lourie, Jonathan Berant. NAACL 2019. When answering a question, people often draw upon their rich world knowledge in addition to the particular context. Recent work has focused primarily on answering questions given some relevant document or context, and required very little general background…
  • DiscoFuse: A Large-Scale Dataset for Discourse-based Sentence Fusion

    Mor Geva, Eric Malmi, Idan Szpektor, Jonathan Berant. NAACL 2019. Sentence fusion is the task of joining several independent sentences into a single coherent text. Current datasets for sentence fusion are small and insufficient for training modern neural models. In this paper, we propose a method for automatically…
  • DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

    Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, Matt Gardner. NAACL-HLT 2019. Reading comprehension has recently seen rapid progress, with systems matching humans on the most popular datasets for the task. However, a large body of work has highlighted the brittleness of these systems, showing that there is much work left to be done. We…
  • Evaluating Text GANs as Language Models

    Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan Berant. NAACL 2019. Generative Adversarial Networks (GANs) are a promising approach for text generation that, unlike traditional language models (LM), does not suffer from the problem of “exposure bias”. However, a major hurdle for understanding the potential of GANs for text…
  • Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets

    Nelson F. Liu, Roy Schwartz, Noah Smith. NAACL 2019. Several datasets have recently been constructed to expose brittleness in models trained on existing benchmarks. While model performance on these challenge datasets is significantly lower compared to the original benchmark, it is unclear what particular…
  • Iterative Search for Weakly Supervised Semantic Parsing

    Pradeep Dasigi, Matt Gardner, Shikhar Murty, Luke Zettlemoyer, Ed Hovy. NAACL 2019. Training semantic parsers from question-answer pairs typically involves searching over an exponentially large space of logical forms, and an unguided search can easily be misled by spurious logical forms that coincidentally evaluate to the correct answer. We…
  • Linguistic Knowledge and Transferability of Contextual Representations

    Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew Peters, Noah A. Smith. NAACL 2019. Contextual word representations derived from large-scale neural language models are successful across a diverse set of NLP tasks, suggesting that they encode useful and transferable features of language. To shed light on the linguistic knowledge they capture…
  • Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them

    Hila Gonen, Yoav Goldberg. NAACL 2019. Word embeddings are widely used in NLP for a vast range of tasks. It was shown that word embeddings derived from text corpora reflect gender biases in society. This phenomenon is pervasive and consistent across different word embedding models, causing serious…