Papers
Viewing 21-30 of 699 papers
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung, Lianhui Qin, S. Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras, Yejin Choi
arXiv • 2022
Despite their impressive capabilities, large pretrained language models (LMs) struggle with consistent reasoning; recently, prompting LMs to generate explanations that self-guide the inference has emerged as a promising direction to amend this. However, these…
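A minimal sketch of the recursive explanation-prompting idea the snippet alludes to, with `call_lm` as a placeholder for whatever LM API you use; it is not the paper's actual procedure, and the consistency-resolution step that maieutic prompting adds on top is omitted.

```python
# Sketch only (not the paper's implementation): recursively prompt a model for
# explanations, so each statement is justified by a generated proposition one
# level deeper. `call_lm` and the prompt wording are placeholders.
def call_lm(prompt: str) -> str:
    raise NotImplementedError("plug in an LM API of your choice here")

def explain_recursively(statement: str, depth: int = 2) -> dict:
    """Build a small explanation tree rooted at `statement`."""
    if depth == 0:
        return {"statement": statement, "because": []}
    explanation = call_lm(f"{statement} This is true because:")
    return {
        "statement": statement,
        "because": [explain_recursively(explanation, depth - 1)],
    }
```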
ABC: Attention with Bounded-memory Control
Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith
ACL • 2022
Transformer architectures have achieved state-of-the-art results on a variety of sequence modeling tasks. However, their attention mechanism comes with a quadratic complexity in sequence lengths, making the computational overhead prohibitive, especially for…
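To make the quadratic-complexity point concrete, here is a small NumPy sketch contrasting standard softmax attention (an n × n score matrix) with a generic bounded-memory variant that first compresses keys and values into a fixed number of slots. The random control weights are purely illustrative and are not the parameterization proposed in the paper.

```python
# Standard attention builds an n x n score matrix; a bounded-memory variant
# attends over a constant number of memory slots instead (n x m scores).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def standard_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (n, n): quadratic in length
    return softmax(scores) @ V

def bounded_memory_attention(Q, K, V, num_slots=8):
    n = K.shape[0]
    control = softmax(np.random.randn(num_slots, n))  # placeholder control weights
    K_mem, V_mem = control @ K, control @ V           # compress to (num_slots, d)
    scores = Q @ K_mem.T / np.sqrt(Q.shape[-1])       # (n, num_slots): linear in length
    return softmax(scores) @ V_mem

n, d = 512, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
print(standard_attention(Q, K, V).shape)        # (512, 64), via a 512x512 score matrix
print(bounded_memory_attention(Q, K, V).shape)  # (512, 64), via a 512x8 score matrix
```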
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hanna Hajishirzi
ACL • 2022
Can we enable NLP models to appropriately respond to instructional prompts and consequently generalize to new tasks? To study this question, we leverage the existing NLP datasets and the instructions that were used to crowdsource them to create…
Extracting Latent Steering Vectors from Pretrained Language Models
Nishant Subramani, Nivedita Suresh, Matthew E. Peters
Findings of ACL • 2022
Prior work on controllable text generation has focused on learning how to control language models through trainable decoding, smart-prompt design, or fine-tuning based on a desired objective. We hypothesize that the information needed to steer the model to…
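The hypothesis is easiest to see with a toy model: adding a fixed vector to a hidden state can push generation toward a target. The `TinyLM` class, the random-search "extraction", and all dimensions below are invented for illustration; the paper optimizes such vectors directly rather than searching randomly.

```python
# Toy illustration of a steering vector: a fixed vector z added to a hidden
# state so that the next-token distribution shifts toward a target token.
import numpy as np

rng = np.random.default_rng(0)

class TinyLM:
    """Toy 'language model': hidden state -> softmax over a small vocabulary."""
    def __init__(self, hidden_dim=16, vocab_size=5):
        self.W = rng.normal(size=(hidden_dim, vocab_size))

    def next_token_probs(self, hidden, steering=None):
        h = hidden if steering is None else hidden + steering  # inject steering vector
        logits = h @ self.W
        e = np.exp(logits - logits.max())
        return e / e.sum()

lm = TinyLM()
hidden = rng.normal(size=16)
target_token = 3

# Crude random-search "extraction": keep the candidate that most raises the
# probability of the target token (a stand-in for gradient-based optimization).
best_z, best_p = np.zeros(16), lm.next_token_probs(hidden)[target_token]
for _ in range(2000):
    z = rng.normal(scale=0.5, size=16)
    p = lm.next_token_probs(hidden, steering=z)[target_token]
    if p > best_p:
        best_z, best_p = z, p

print(f"p(target) without steering: {lm.next_token_probs(hidden)[target_token]:.3f}")
print(f"p(target) with steering:    {best_p:.3f}")
```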
Generated Knowledge Prompting for Commonsense Reasoning
Jiacheng Liu, Alisa Liu, Ximing Lu, S. Welleck, Peter West, Ronan Le Bras, Yejin Choi, Hannaneh Hajishirzi
ACL • 2022
Despite their ability to capture large amounts of knowledge during pretraining, large-scale language models often benefit from incorporating external knowledge bases, especially on commonsense reasoning tasks. This motivates us to explore how we can best…
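A sketch of the generate-then-answer flow the title refers to, under the assumption of a generic text-completion API: knowledge statements are sampled first and then prepended to the question. `call_lm` and the prompt templates are placeholders, not the paper's templates or models.

```python
# Generate knowledge statements with an LM, then condition the answer on them.
def call_lm(prompt: str) -> str:
    raise NotImplementedError("plug in your LM API here")

def generated_knowledge_answer(question: str, num_statements: int = 3) -> str:
    # Step 1: elicit background knowledge related to the question.
    knowledge = [
        call_lm(f"Generate a relevant fact about the following question.\n"
                f"Question: {question}\nFact:")
        for _ in range(num_statements)
    ]
    # Step 2: answer the question with the generated knowledge as context.
    context = "\n".join(f"- {fact}" for fact in knowledge)
    return call_lm(f"Knowledge:\n{context}\n\nQuestion: {question}\nAnswer:")
```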
Generating Data to Mitigate Spurious Correlations in Natural Language Inference Datasets
Yuxiang Wu, Matt Gardner, Pontus Stenetorp, Pradeep Dasigi
ACL • 2022
Natural language processing models often exploit spurious correlations between task-independent features and labels in datasets to perform well only within the distributions they are trained on, while not generalising to different task distributions. We…
Generating Scientific Definitions with Controllable Complexity
Tal August, Katharina Reinecke, Noah A. Smith
ACL • 2022
Unfamiliar terminology and complex language can present barriers to understanding science. Natural language processing stands to help address these issues by automatically defining unfamiliar terms. We introduce a new task and dataset for defining scientific…
Hey AI, Can You Solve Complex Tasks by Talking to Agents?
Tushar Khot, Kyle Richardson, Daniel Khashabi, Ashish Sabharwal
Findings of ACL • 2022
Humans often solve complex problems by interacting (in natural language) with existing agents, such as AI assistants, that can solve simpler sub-tasks. These agents themselves can be powerful systems built using extensive resources and privately held data. In…
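To illustrate the "talking to agents" setup in miniature, the sketch below routes a two-step plan through toy agents. The agents, the plan format, and the example question are invented for illustration and do not correspond to the paper's benchmark or systems.

```python
# Solve a complex question by delegating sub-questions to simpler agents and
# feeding each agent's answer into the next step.
from typing import Callable, Dict

def calculator_agent(query: str) -> str:
    # A deliberately restricted "agent" that evaluates simple arithmetic.
    return str(eval(query, {"__builtins__": {}}, {}))

def lookup_agent(query: str) -> str:
    facts = {"boiling point of water (C)": "100"}
    return facts.get(query, "unknown")

AGENTS: Dict[str, Callable[[str], str]] = {
    "calc": calculator_agent,
    "lookup": lookup_agent,
}

def solve(plan):
    """Execute a plan given as (agent_name, query) steps, reusing earlier answers."""
    results = []
    for agent_name, query in plan:
        query = query.format(*results)      # earlier answers fill the placeholders
        results.append(AGENTS[agent_name](query))
    return results[-1]

# "What is twice the boiling point of water in Celsius?"
print(solve([("lookup", "boiling point of water (C)"), ("calc", "2 * {0}")]))  # 200
```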
Large Scale Substitution-based Word Sense Induction
Matan Eyal, Shoval Sadde, Hillel Taub-Tabib, Yoav Goldberg
ACL • 2022
We present a word-sense induction method based on pre-trained masked language models (MLMs), which can cheaply scale to large vocabularies and large corpora. The result is a corpus which is sense-tagged according to a corpus-derived sense inventory and where…
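A rough sketch of the substitution-based idea: represent each occurrence of a target word by the substitutes a masked LM proposes for it, then merge occurrences whose substitute sets overlap. `mlm_substitutes`, the greedy clustering, and the Jaccard threshold are stand-ins; the paper's actual pipeline and clustering algorithm differ.

```python
# Induce senses for `target` by clustering occurrences according to overlap
# between their MLM-proposed substitutes. `mlm_substitutes` is a placeholder.
from typing import List, Set, Tuple

def mlm_substitutes(sentence: str, target: str, top_k: int = 10) -> Set[str]:
    raise NotImplementedError("query a masked LM for top-k in-context substitutes")

def induce_senses(occurrences: List[str], target: str,
                  threshold: float = 0.3) -> List[Tuple[Set[str], List[str]]]:
    clusters: List[Tuple[Set[str], List[str]]] = []   # (substitute set, member sentences)
    for sentence in occurrences:
        subs = mlm_substitutes(sentence, target)
        for rep, members in clusters:
            jaccard = len(subs & rep) / len(subs | rep)
            if jaccard >= threshold:     # treat as the same induced sense
                members.append(sentence)
                rep.update(subs)         # grow the cluster's substitute set
                break
        else:
            clusters.append((set(subs), [sentence]))
    return clusters
```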
Productive Performance Engineering for Weather and Climate Modeling with Python
Tal Ben-Nun, Linus Groner, Florian Deconinck, Tobias Wicky, Eddie Davis, Johann P. S. Dahm, Oliver D. Elbert, Rhea George, Jeremy McGibbon, Lukas Trümper, Elynn Wu, Oliver Fuhrer, Thomas Schulthess, Torsten Hoefler
arXiv • 2022
Earth system models are developed with a tight coupling to target hardware, often containing highly-specialized code predicated on processor characteristics. This coupling stems from using imperative languages that hard-code computation schedules and layout…