Papers
Viewing 101-110 of 292 papers
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso, Tianchu Ji, Ji-Ung Lee, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Pedro Henrique Martins, André F. T. Martins, Peter Milder, Colin Raffel, Jessica Zosa Forde, Emma Strubell, Edwin Simpson, N. Slonim, Jesse Dodge, Niranjan Balasubramanian, Iryna Gurevych, Leon Derczynski, Roy Schwartz
arXiv • 2022
Getting the most out of limited resources allows advances in natural language processing (NLP) research and practice while being conservative with resources. Those resources may be data, time, storage, or energy. Recent work in NLP has yielded interesting…
Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
Akari Asai, Matt Gardner, Hannaneh Hajishirzi
NAACL • 2022
Retrieval-augmented generation models have shown state-of-the-art performance across many knowledge-intensive NLP tasks such as open-domain question answering and fact verification. These models are trained to generate a final output given retrieved passages…
FaVIQ: FAct Verification from Information-seeking Questions
Jungsoo Park, Sewon Min, Jaewoo Kang, Luke Zettlemoyer, Hannaneh Hajishirzi
ACL • 2022
Despite significant interest in developing general purpose fact checking models, it is challenging to construct a large-scale fact verification dataset with realistic real-world claims. Existing claims are either authored by crowdworkers, thereby introducing…
MetaICL: Learning to Learn In Context
Sewon Min, M. Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
NAACL • 2022
We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks. This meta-training enables the model to…
Noisy Channel Language Model Prompting for Few-Shot Text Classification
Sewon Min, Michael Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
ACL • 2022
We introduce a noisy channel approach for language model prompting in few-shot text classification. Instead of computing the likelihood of the label given the input (referred as direct models), channel models compute the conditional probability of the input…
Robust fine-tuning of zero-shot models
Mitchell Wortsman, Gabriel Ilharco, Mike Li, Jong Wook Kim, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt
CVPR • 2022
Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve…
Best Paper Finalist
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith
NAACL • 2022
Warning: this paper discusses and contains content that is offensive or upsetting. The perceived toxicity of language can vary based on someone’s identity and beliefs, but this variation is often ignored when collecting toxic language datasets, resulting in…
Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand
Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Morrison, Alexander R. Fabbri, Yejin Choi, Noah A. Smith
NAACL • 2022
Natural language processing researchers have identified limitations of evaluation methodology for generation tasks, with new questions raised about the validity of automatic metrics and of crowdworker judgments. Meanwhile, efforts to improve generation models…
DEMix Layers: Disentangling Domains for Modular Language Modeling
Suchin Gururangan, Michael Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
NAACL • 2022
We introduce a new domain expert mixture (DEMIX) layer that enables conditioning a language model (LM) on the domain of the input text. A DEMIX layer is a collection of expert feedforward networks, each specialized to a domain, that makes the LM modular…
Efficient Hierarchical Domain Adaptation for Pretrained Language Models
Alexandra Chronopoulou, Matthew E. Peters, Jesse Dodge
NAACL • 2022
The remarkable success of large language models has been driven by dense models trained on massive unlabeled, unstructured corpora. These corpora typically contain text from diverse, heterogeneous sources, but information about the source of the text is…