Papers
Viewing 31-40 of 926 papers
Nonparametric Masked Language Modeling
Sewon Min, Weijia Shi, M. Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer • ACL • Findings • 2023
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first nonparametric masked language model that replaces this softmax with a…

One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu • ACL • Findings • 2023
We introduce INSTRUCTOR, a new method for computing text embeddings given task instructions: every text input is embedded together with instructions explaining the use case (e.g., task and domain descriptions). Unlike encoders from prior work that are more…

PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
Qingqing Cao, Bhargavi Paranjape, Hanna Hajishirzi • ACL • 2023
Large-scale vision language (VL) models use Transformers to perform cross-modal interactions between the input text and image. These cross-modal interactions are computationally expensive and memory-intensive due to the quadratic complexity of processing the…

Risks and NLP Design: A Case Study on Procedural Document QA
Nikita Haduong, Alice Gao, Noah A. Smith • ACL • Findings • 2023
As NLP systems are increasingly deployed at scale, concerns about their potential negative impacts have attracted the attention of the research community, yet discussions of risk have mostly been at an abstract level and focused on generic AI or NLP…

Riveter: Measuring Power and Social Dynamics Between Entities
Maria Antoniak, Anjalie Field, Jimin Mun, Melanie Walsh, Lauren F. Klein, Maarten Sap • ACL • 2023
Riveter provides a complete easy-to-use pipeline for analyzing verb connotations associated with entities in text corpora. We prepopulate the package with connotation frames of sentiment, power, and agency, which have demonstrated usefulness for capturing…

RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs
Afra Feyza Akyurek, Ekin Akyürek, Aman Madaan, A. Kalyan, Peter Clark, D. Wijaya, Niket Tandon • ACL • 2023
Despite their unprecedented success, even the largest language models make mistakes. Similar to how humans learn and improve using feedback, previous work proposed providing language models with natural language feedback to guide them in repairing their…

Self-Instruct: Aligning Language Models with Self-Generated Instructions
Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi • ACL • 2023
Large “instruction-tuned” language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in…

Stubborn Lexical Bias in Data and Models
Sofia Serrano, Jesse Dodge, Noah A. Smith • ACL • 2023
In NLP, recent work has seen increased focus on spurious correlations between various features and labels in training data, and how these influence model behavior. However, the presence and effect of such correlations are typically examined feature by feature…

Task-aware Retrieval with Instructions
Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, Sebastian Riedel, Hannaneh Hajishirzi, Wen-tau Yih • ACL • Findings • 2023
We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries. We aim to develop a general-purpose task-aware retrieval system using multi-task instruction tuning, which can…

When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, Hannaneh Hajishirzi • ACL • 2023
Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the difficulty of encoding a wealth of world knowledge in their parameters. This paper aims to understand LMs…