Papers

  • Multimodal Knowledge Alignment with Reinforcement Learning

    Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, J. Park, Ximing Lu, Prithviraj Ammanabrolu, Rowan Zellers, Ronan Le Bras, Gunhee Kim, Yejin Choi. CVPR, 2022. Large language models readily adapt to novel settings, even without task-specific training data. Can their zero-shot capacity be extended to multimodal inputs? In this work, we propose ESPER which extends language-only zero-shot models to unseen multimodal…
  • NaturalProver: Grounded Mathematical Proof Generation with Language Models

    S. Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi. arXiv, 2022. Theorem proving in natural mathematical language - the mixture of symbolic and natural language used by humans - plays a central role in mathematical advances and education, and tests aspects of reasoning that are core to intelligence. Yet it has remained…
  • ProsocialDialog: A Prosocial Backbone for Conversational Agents

    Hyunwoo Kim, Youngjae Yu, Liwei Jiang, Ximing Lu, Daniel Khashabi, Gunhee Kim, Yejin Choi, Maarten Sap. EMNLP, 2022. Most existing dialogue systems fail to respond properly to potentially unsafe user utterances by either ignoring or passively agreeing with them. To address this issue, we introduce ProsocialDialog, the first large-scale multi-turn dialogue dataset to teach…
  • Zero- and Few-Shot NLP with Pretrained Language Models

    Iz Beltagy, Arman Cohan, Robert Logan IV, Sewon Min, Sameer Singh. ACL tutorial, 2022. The ability to efficiently learn from little-to-no data is critical to applying NLP to tasks where data collection is costly or otherwise difficult. This is a challenging setting both academically and practically—particularly because training neural models…
  • Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations

    Jaehun Jung, Lianhui Qin, S. Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras, Yejin Choi. EMNLP, 2022. Despite their impressive capabilities, large pretrained language models (LMs) struggle with consistent reasoning; recently, prompting LMs to generate explanations that self-guide the inference has emerged as a promising direction to amend this. However, these…
  • Penguins Don't Fly: Reasoning about Generics through Instantiations and Exceptions

    Emily Allaway, Jena D. Hwang, Chandra Bhagavatula, K. McKeown, Doug Downey, Yejin Choi. arXiv, 2022. Generics express generalizations about the world (e.g., “birds can fly”). However, they are not universally true – while sparrows and penguins are both birds, only sparrows can fly and penguins cannot. Commonsense knowledge bases, which are used extensively in…
  • ABC: Attention with Bounded-memory Control

    Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith. ACL, 2022. Transformer architectures have achieved state-of-the-art results on a variety of sequence modeling tasks. However, their attention mechanism comes with a quadratic complexity in sequence lengths, making the computational overhead prohibitive, especially for…
  • Cross-Task Generalization via Natural Language Crowdsourcing Instructions

    Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hanna Hajishirzi. ACL, 2022. Can we enable NLP models to appropriately respond to instructional prompts and consequently generalize to new tasks? To study this question, we leverage the existing NLP datasets and the instructions that were used to crowdsource them to create…
  • Draw Me a Flower: Grounding Formal Abstract Structures Stated in Informal Natural Language

    Royi Lachmy, Valentina Pyatkin, Reut Tsarfaty. ACL, 2022. Forming and interpreting abstraction is a core process in human communication. In particular, when giving and performing complex instructions stated in natural language (NL), people may naturally evoke abstract constructs such as objects, loops, conditions…
  • Extracting Latent Steering Vectors from Pretrained Language Models

    Nishant Subramani, Nivedita Suresh, Matthew E. Peters. Findings of ACL, 2022. Prior work on controllable text generation has focused on learning how to control language models through trainable decoding, smart-prompt design, or fine-tuning based on a desired objective. We hypothesize that the information needed to steer the model to…