Papers

Viewing 21-30 of 192 papers
  • Zero- and Few-Shot NLP with Pretrained Language Models

    Iz Beltagy, Arman Cohan, Robert Logan IV, Sewon Min, Sameer Singh. ACL (tutorial), 2022. The ability to efficiently learn from little-to-no data is critical to applying NLP to tasks where data collection is costly or otherwise difficult. This is a challenging setting both academically and practically, particularly because training neural models…
  • ABC: Attention with Bounded-memory Control

    Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith. ACL, 2022. Transformer architectures have achieved state-of-the-art results on a variety of sequence modeling tasks. However, their attention mechanism comes with a quadratic complexity in sequence length, making the computational overhead prohibitive, especially for…
  • Cross-Task Generalization via Natural Language Crowdsourcing Instructions

    Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hanna Hajishirzi. ACL, 2022. Can we enable NLP models to appropriately respond to instructional prompts and consequently generalize to new tasks? To study this question, we leverage the existing NLP datasets and the instructions that were used to crowdsource them to create…
  • Extracting Latent Steering Vectors from Pretrained Language Models

    Nishant Subramani, Nivedita Suresh, Matthew E. Peters. Findings of ACL, 2022. Prior work on controllable text generation has focused on learning how to control language models through trainable decoding, smart-prompt design, or fine-tuning based on a desired objective. We hypothesize that the information needed to steer the model to…
  • Generated Knowledge Prompting for Commonsense Reasoning

    Jiacheng Liu, Alisa Liu, Ximing Lu, S. Welleck, Peter West, Ronan Le Bras, Yejin Choi, Hannaneh Hajishirzi. ACL, 2022. Despite their ability to capture large amounts of knowledge during pretraining, large-scale language models often benefit from incorporating external knowledge bases, especially on commonsense reasoning tasks. This motivates us to explore how we can best…
  • Generating Data to Mitigate Spurious Correlations in Natural Language Inference Datasets

    Yuxiang Wu, Matt Gardner, Pontus Stenetorp, Pradeep Dasigi. ACL, 2022. Natural language processing models often exploit spurious correlations between task-independent features and labels in datasets to perform well only within the distributions they are trained on, while not generalising to different task distributions. We…
  • Generating Scientific Definitions with Controllable Complexity

    Tal August, Katharina Reinecke, Noah A. Smith. ACL, 2022. Unfamiliar terminology and complex language can present barriers to understanding science. Natural language processing stands to help address these issues by automatically defining unfamiliar terms. We introduce a new task and dataset for defining scientific…
  • Is GPT-3 Text Indistinguishable from Human Text? SCARECROW: A Framework for Scrutinizing Machine Text

    Yao Dou, Maxwell Forbes, Rik Koncel-Kedziorski, Noah A. Smith, Yejin Choi. ACL, 2022. Modern neural text generation systems can produce remarkably fluent and grammatical texts. While earlier language models suffered from repetition and syntactic errors, the errors made by contemporary models are often semantic, narrative, or discourse failures…
  • Reframing Instructional Prompts to GPTk's Language

    Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hanna Hajishirzi. Findings of ACL, 2022. How can model designers turn task instructions into effective prompts for language models? Backed by extensive empirical analysis on GPT3, we observe important features for successful instructional prompts, and propose several reframing techniques for model…
  • Twist Decoding: Diverse Generators Guide Each Other

    Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Hao Peng, Ximing Lu, Dragomir Radev, Yejin Choi, Noah A. Smith. arXiv, 2022. Natural language generation technology has recently seen remarkable progress with large-scale training, and many natural language applications are now built upon a wide range of generation models. Combining diverse models may lead to further progress, but…