Papers

  • Multi-Modal Answer Validation for Knowledge-Based VQA

    Jialin Wu, Jiasen Lu, Ashish Sabharwal, R. Mottaghi. AAAI 2022. The problem of knowledge-based visual question answering involves answering questions that require external knowledge in addition to the content of the image. Such knowledge typically comes in a variety of forms, including visual, textual, and commonsense…
  • DREAM: Uncovering Mental Models behind Language Models

    Yuling Gu, Bhavana Dalvi, Peter Clark. arXiv 2021. …(e.g., questions about a specific ethical dilemma)? While cognitive science has shown that mental models play a fundamental role in human problem-solving, it is unclear whether the high question-answering performance of existing LMs is backed by similar model…
  • Dyna-bAbI: unlocking bAbI’s potential with dynamic synthetic benchmarking

    Ronen Tamari, Kyle Richardson, Aviad Sar-Shalom, Noam Kahlon, Nelson H S Liu, Reut Tsarfaty, Dafna Shahaf. arXiv 2021. While neural language models often perform surprisingly well on natural language understanding (NLU) tasks, their strengths and limitations remain poorly understood. Controlled synthetic tasks are thus an increasingly important resource for diagnosing model…
  • BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief

    Nora Kassner, Oyvind Tafjord, H. Schutze, P. Clark. EMNLP 2021. Although pretrained language models (PTLMs) have been shown to contain significant amounts of world knowledge, they can still produce inconsistent answers to questions when probed, even after using specialized training techniques to reduce inconsistency. As a…
  • Explaining Answers with Entailment Trees

    Bhavana Dalvi, Peter A. Jansen, Oyvind Tafjord, Zhengnan Xie, Hannah Smith, Leighanna Pipatanangkura, Peter Clark. EMNLP 2021. Our goal, in the context of open-domain textual question-answering (QA), is to explain answers by not just listing supporting textual evidence (“rationales”), but also showing how such evidence leads to the answer in a systematic way. If this could be done…
  • GooAQ: Open Question Answering with Diverse Answer Types

    Daniel Khashabi, Amos Ng, Tushar Khot, Ashish Sabharwal, Hanna Hajishirzi, Chris Callison-Burch. Findings of EMNLP 2021. While day-to-day questions come with a variety of answer types, the current question-answering (QA) literature has failed to adequately address the answer diversity of questions. To this end, we present GOOAQ, a large-scale dataset with a variety of answer…
  • How Much Coffee Was Consumed During EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI

    A. Kalyan, Abhinav Kumar, Arjun Chandrasekaran, Ashish Sabharwal, Peter Clark. EMNLP 2021. Many real-world problems require the combined application of multiple reasoning abilities employing suitable abstractions, commonsense knowledge, and creative synthesis of problem-solving strategies. To help advance AI systems towards such capabilities, we…
  • proScript: Partially Ordered Scripts Generation

    Keisuke Sakaguchi, Chandra Bhagavatula, R. L. Bras, Niket Tandon, P. Clark, Yejin Choi. Findings of EMNLP 2021. Scripts, standardized event sequences describing typical everyday activities, have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated information. However, to date they have proved hard to author or…
  • Learning to Solve Complex Tasks by Talking to Agents

    Tushar Khot, Kyle Richardson, Daniel Khashabi, Ashish Sabharwal. arXiv 2021. Humans often solve complex problems by interacting (in natural language) with existing agents, such as AI assistants, that can solve simpler sub-tasks. These agents themselves can be powerful systems built using extensive resources and privately held data. In…
  • Ethical-Advice Taker: Do Language Models Understand Natural Language Interventions?

    Jieyu Zhao, Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Kai-Wei Chang. ACL-IJCNLP 2021. Is it possible to use natural language to intervene in a model’s behavior and alter its prediction in a desired way? We investigate the effectiveness of natural language interventions for reading-comprehension systems, studying this in the context of social…