Papers

  • Nonparametric Masked Language Modeling

    Sewon Min, Weijia Shi, M. Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer
    ACL • Findings, 2023
    Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first nonparametric masked language model that replaces this softmax with a…
  • One Embedder, Any Task: Instruction-Finetuned Text Embeddings

    Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu
    ACL • Findings, 2023
    We introduce INSTRUCTOR, a new method for computing text embeddings given task instructions: every text input is embedded together with instructions explaining the use case (e.g., task and domain descriptions). Unlike encoders from prior work that are more…
  • PuMer: Pruning and Merging Tokens for Efficient Vision Language Models

    Qingqing Cao, Bhargavi Paranjape, Hanna Hajishirzi
    ACL, 2023
    Large-scale vision language (VL) models use Transformers to perform cross-modal interactions between the input text and image. These cross-modal interactions are computationally expensive and memory-intensive due to the quadratic complexity of processing the…
  • Risks and NLP Design: A Case Study on Procedural Document QA

    Nikita Haduong, Alice Gao, Noah A. Smith
    ACL • Findings, 2023
    As NLP systems are increasingly deployed at scale, concerns about their potential negative impacts have attracted the attention of the research community, yet discussions of risk have mostly been at an abstract level and focused on generic AI or NLP…
  • Riveter: Measuring Power and Social Dynamics Between Entities

    Maria Antoniak, Anjalie Field, Jimin Mun, Melanie Walsh, Lauren F. Klein, Maarten Sap
    ACL, 2023
    Riveter provides a complete, easy-to-use pipeline for analyzing verb connotations associated with entities in text corpora. We prepopulate the package with connotation frames of sentiment, power, and agency, which have demonstrated usefulness for capturing…
  • RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs

    Afra Feyza Akyurek, Ekin Akyürek, Aman Madaan, A. Kalyan, Peter Clark, D. Wijaya, Niket Tandon
    Annual Meeting of the Association for Computational Linguistics, 2023
    Despite their unprecedented success, even the largest language models make mistakes. Similar to how humans learn and improve using feedback, previous work proposed providing language models with natural language feedback to guide them in repairing their…
  • Self-Instruct: Aligning Language Models with Self-Generated Instructions

    Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
    ACL, 2023
    Large “instruction-tuned” language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in…
  • Stubborn Lexical Bias in Data and Models

    Sofia Serrano, Jesse Dodge, Noah A. Smith
    ACL, 2023
    In NLP, recent work has seen increased focus on spurious correlations between various features and labels in training data, and how these influence model behavior. However, the presence and effect of such correlations are typically examined feature by feature…
  • Task-aware Retrieval with Instructions

    Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, Sebastian Riedel, Hannaneh Hajishirzi, Wen-tau Yih
    ACL • Findings, 2023
    We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries. We aim to develop a general-purpose task-aware retrieval system using multi-task instruction tuning, which can…
  • When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories

    Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, Hannaneh Hajishirzi
    ACL, 2023
    Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the difficulty of encoding a wealth of world knowledge in their parameters. This paper aims to understand LMs…