Papers

Viewing 1-10 of 164 papers
  • Dyna-bAbI: unlocking bAbI’s potential with dynamic synthetic benchmarking

    Ronen Tamari, Kyle Richardson, Aviad Sar-Shalom, Noam Kahlon, Nelson H. S. Liu, Reut Tsarfaty, Dafna Shahaf. SEM 2022. While neural language models often perform surprisingly well on natural language understanding (NLU) tasks, their strengths and limitations remain poorly understood. Controlled synthetic tasks are thus an increasingly important resource for diagnosing model…
  • Learning to Repair: Repairing model output errors after deployment using a dynamic memory of feedback

    Niket Tandon, Aman Madaan, Peter Clark, Yiming Yang. Findings of EMNLP 2022. Large language models (LMs), while powerful, are not immune to mistakes, but can be difficult to retrain. Our goal is for an LM to continue to improve after deployment, without retraining, using feedback from the user. Our approach pairs an LM with (i) a…
  • Log-Precision Transformers are Uniform Threshold Circuits

    William Merrill, Ashish Sabharwal. arXiv 2022. We prove that transformer neural networks with logarithmic precision in the input length (and where the feedforward subnetworks are computable using linear space in their input length) can be simulated by constant-depth uniform threshold circuits. Thus, such…
  • DeepA2: A Modular Framework for Deep Argument Analysis with Pretrained Neural Text2Text Language Models

    Gregor Betz, Kyle Richardson. SEM 2022. In this paper, we present and implement a multi-dimensional, modular framework for performing deep argument analysis (DeepA2) using current pre-trained language models (PTLMs). ArgumentAnalyst – a T5 model (Raffel et al. 2020) set up and trained within DeepA2…
  • Retrieval Data Augmentation Informed by Downstream Question Answering Performance

    James Ferguson, Pradeep Dasigi, Tushar Khot, Hannaneh Hajishirzi. ACL • FEVER 2022. Training retrieval models to fetch contexts for Question Answering (QA) over large corpora requires labeling relevant passages in those corpora. Since obtaining exhaustive manual annotations of all relevant passages is not feasible, prior work uses text…
  • Teaching Broad Reasoning Skills via Decomposition-Guided Contexts

    Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal. arXiv 2022. Question-answering datasets require a broad set of reasoning skills. We show how to use question decompositions to teach language models these broad reasoning skills in a robust fashion. Specifically, we use widely available QDMR representations to…
  • Cross-Task Generalization via Natural Language Crowdsourcing Instructions

    Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi. ACL 2022. Can we enable NLP models to appropriately respond to instructional prompts and consequently generalize to new tasks? To study this question, we leverage the existing NLP datasets and the instructions that were used to crowdsource them to create…
  • Hey AI, Can You Solve Complex Tasks by Talking to Agents?

    Tushar Khot, Kyle Richardson, Daniel Khashabi, Ashish Sabharwal. Findings of ACL 2022. Humans often solve complex problems by interacting (in natural language) with existing agents, such as AI assistants, that can solve simpler sub-tasks. These agents themselves can be powerful systems built using extensive resources and privately held data. In…
  • Better Retrieval May Not Lead to Better Question Answering

    Zhengzhong Liang, Tushar Khot, Steven Bethard, Mihai Surdeanu, Ashish Sabharwal. arXiv 2022. Considerable progress has been made recently in open-domain question answering (QA) problems, which require Information Retrieval (IR) and Reading Comprehension (RC). A popular approach to improve the system's performance is to improve the quality of the…
  • Saturated Transformers are Constant-Depth Threshold Circuits

    William Merrill, Ashish Sabharwal, Noah A. Smith. TACL 2022. Transformers have become a standard neural network architecture for many NLP problems, motivating theoretical analysis of their power in terms of formal languages. Recent work has shown that transformers with *hard* attention are quite limited in power, as…