Papers

  • Efficient Methods for Natural Language Processing: A Survey

    Marcos Vinícius Treviso, Tianchu Ji, Ji-Ung Lee, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Pedro Henrique Martins, André F. T. Martins, Peter Milder, Colin Raffel, Edwin Simpson, N. Slonim, Niranjan Balasubramanian, Leon Derczynski, Roy Schwartz • TACL • 2023
    Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows. Such resources include data, time…
  • Elaboration-Generating Commonsense Question Answering at Scale

    Wenya Wang, Vivek Srikumar, Hannaneh Hajishirzi, Noah A. Smith • ACL • 2023
    In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working with such models is very high; in this work, we finetune…
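    The generate-then-read pattern this abstract refers to is simple to state in code. A minimal sketch, assuming hypothetical stand-in functions (generate_elaboration, answer) in place of the paper's finetuned models:

```python
# Two-stage pattern from the abstract: an LM first writes background
# "elaboration" text, which the answering model reads with the question.
# Both functions are hypothetical stand-ins, not the paper's models.

def generate_elaboration(question: str) -> str:
    """Stand-in for an LM prompted to write commonsense background text."""
    return "Penguins are birds, but their wings are adapted for swimming, not flight."

def answer(elaboration: str, question: str) -> str:
    """Stand-in for a QA model that conditions on elaboration + question."""
    return "no" if "not flight" in elaboration else "unknown"

question = "Can penguins fly?"
print(answer(generate_elaboration(question), question))  # "no"
```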
  • Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation

    Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, D. Klakow, Yanai Elazar • Findings of ACL • 2023
    Few-shot fine-tuning and in-context learning are two alternative strategies for task adaptation of pre-trained language models. Recently, in-context learning has gained popularity over fine-tuning due to its simplicity and improved out-of-domain…
  • FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning

    Qinyuan Ye, Iz Beltagy, Matthew E. Peters, Xiang Ren, Hannaneh Hajishirzi • ACL • 2023
    Large pre-trained models are capable of few-shot in-context learning (ICL), i.e., performing a new task by prepending a few demonstrations before the test input. However, the concatenated demonstrations are often excessively long and induce additional…
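    The overhead FiD-ICL targets follows directly from how ICL prompts are assembled. A minimal sketch of standard demonstration concatenation, with an illustrative task and prompt format (not taken from the paper):

```python
# Standard in-context learning: every demonstration is prepended to every
# test input, so the encoder reprocesses all k demonstrations per example.
# Task, demonstrations, and prompt format here are illustrative only.

demonstrations = [
    ("The movie was a delight.", "positive"),
    ("I want my money back.", "negative"),
    ("A masterpiece of boredom.", "negative"),
]

def build_icl_prompt(demos, test_input):
    """Concatenate (input, label) demonstrations before the test input."""
    parts = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    parts.append(f"Input: {test_input}\nLabel:")
    return "\n\n".join(parts)

print(build_icl_prompt(demonstrations, "The plot kept me hooked."))
# With k demonstrations of ~L tokens each, every query pays O(k * L) extra
# tokens; fusion-in-decoder approaches instead encode demonstrations separately.
```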
  • HINT: Hypernetwork Instruction Tuning for Efficient Zero-Shot Generalisation

    Hamish Ivison, Akshita Bhagia, Yizhong Wang, Hannaneh Hajishirzi, Matthew E. Peters • ACL • 2023
    Recent NLP models can generalise ‘zero-shot’ to new tasks using only an instruction as guidance. However, these approaches usually repeat their instructions with every input, requiring costly reprocessing of lengthy instructions for…
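    A back-of-the-envelope sketch of the reprocessing cost described here, using toy token counts; the amortized variant illustrates only the generic "encode the instruction once" idea, not HINT's hypernetwork:

```python
# Toy token-count arithmetic for instruction-following inference.
# All numbers are illustrative assumptions, not measurements from the paper.

instruction_tokens = 400      # a lengthy task instruction
input_tokens = 30             # a typical test input
n_examples = 10_000

# Naive: the full instruction is re-encoded with every input.
naive_cost = n_examples * (instruction_tokens + input_tokens)

# Amortized: process the instruction once up front, then pay only
# for the inputs themselves on each example.
amortized_cost = instruction_tokens + n_examples * input_tokens

print(naive_cost / amortized_cost)  # ~14x fewer tokens encoded in this toy setting
```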
  • NarrowBERT: Accelerating Masked Language Model Pretraining and Inference

    Haoxin Li, Phillip Keung, Daniel Cheng, Jungo Kasai, Noah A. Smith • Proceedings of ACL • 2023
    Large-scale language model pretraining is a very successful form of self-supervised learning in natural language processing, but it is increasingly expensive as models and pretraining corpora grow. We propose NarrowBERT…
  • Nonparametric Masked Language Modeling

    Sewon Min, Weijia Shi, M. Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer • Findings of ACL • 2023
    Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first nonparametric masked language model that replaces this softmax with a…
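    A toy contrast between the two prediction schemes the abstract opposes: a softmax over a fixed vocabulary versus scoring the masked position against a nonparametric store of phrase embeddings. All vectors are random stand-ins, not NPM's architecture:

```python
# Parametric vs. nonparametric prediction at a [MASK] position.
# All tensors are random toys; this is not NPM's actual model.
import numpy as np

rng = np.random.default_rng(0)
d = 16
h = rng.normal(size=d)                      # encoder state at [MASK]

# Parametric baseline: softmax over a finite |V| x d output matrix.
W = rng.normal(size=(1000, d))
logits = W @ h
probs = np.exp(logits - logits.max())
probs /= probs.sum()                        # limited to the 1000-token vocabulary

# Nonparametric alternative: retrieve the best-scoring phrase embedding
# from a corpus-derived store, so rare or unseen phrases stay reachable.
phrase_store = {f"phrase_{i}": rng.normal(size=d) for i in range(5000)}
prediction = max(phrase_store, key=lambda p: phrase_store[p] @ h)
print(prediction)
```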
  • One Embedder, Any Task: Instruction-Finetuned Text Embeddings

    Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu • Findings of ACL • 2023
    We introduce INSTRUCTOR, a new method for computing text embeddings given task instructions: every text input is embedded together with instructions explaining the use case (e.g., task and domain descriptions). Unlike encoders from prior work that are more…
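    A minimal sketch of instruction-conditioned embedding as the abstract describes it: the instruction is encoded together with the text, so the same input yields different vectors for different use cases. The encoder here is a toy hash-based stand-in, not the INSTRUCTOR model:

```python
# Instruction-conditioned embeddings, with a deterministic toy encoder.
import hashlib
import numpy as np

def toy_encode(text: str, dim: int = 8) -> np.ndarray:
    """Deterministic stand-in for a real sentence encoder."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    return np.random.default_rng(seed).normal(size=dim)

def embed(instruction: str, text: str) -> np.ndarray:
    # Key idea: embed the instruction together with the input text.
    return toy_encode(instruction + " " + text)

doc = "Transformers use self-attention."
v_retrieval = embed("Represent the scientific sentence for retrieval:", doc)
v_cluster = embed("Represent the sentence for clustering:", doc)
# Same text, different task instructions -> different embeddings.
print(np.allclose(v_retrieval, v_cluster))  # False
```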
  • PuMer: Pruning and Merging Tokens for Efficient Vision Language Models

    Qingqing Cao, Bhargavi Paranjape, Hannaneh Hajishirzi • ACL • 2023
    Large-scale vision language (VL) models use Transformers to perform cross-modal interactions between the input text and image. These cross-modal interactions are computationally expensive and memory-intensive due to the quadratic complexity of processing the…
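    A quick illustration of why token pruning and merging pays off in VL Transformers: self-attention cost grows quadratically in the combined text+image sequence length, so halving the tokens cuts the attention cost by roughly 4x. All numbers are illustrative assumptions, not measurements from the paper:

```python
# Quadratic attention cost vs. a token-reduced sequence (toy numbers).

text_tokens, image_tokens = 40, 576        # e.g. a 24x24 ViT patch grid
full_len = text_tokens + image_tokens

def attention_cost(n: int) -> int:
    """Pairwise token interactions scale as O(n^2)."""
    return n * n

reduced_len = full_len // 2                # after pruning/merging tokens
print(attention_cost(full_len) / attention_cost(reduced_len))  # 4.0
```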
  • Risks and NLP Design: A Case Study on Procedural Document QA

    Nikita Haduong, Alice Gao, Noah A. Smith • Findings of ACL • 2023
    As NLP systems are increasingly deployed at scale, concerns about their potential negative impacts have attracted the attention of the research community, yet discussions of risk have mostly been at an abstract level and focused on generic AI or NLP…