Papers

  • Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection

    Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith. NAACL 2022. Warning: this paper discusses and contains content that is offensive or upsetting. The perceived toxicity of language can vary based on someone’s identity and beliefs, but this variation is often ignored when collecting toxic language datasets, resulting in…
  • Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

    Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Morrison, Alexander R. Fabbri, Yejin Choi, Noah A. Smith. NAACL 2022. Natural language processing researchers have identified limitations of evaluation methodology for generation tasks, with new questions raised about the validity of automatic metrics and of crowdworker judgments. Meanwhile, efforts to improve generation models…
  • DEMix Layers: Disentangling Domains for Modular Language Modeling

    Suchin Gururangan, Michael Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer. NAACL 2022. We introduce a new domain expert mixture (DEMIX) layer that enables conditioning a language model (LM) on the domain of the input text. A DEMIX layer is a collection of expert feedforward networks, each specialized to a domain, that makes the LM modular…
  • Few-Shot Self-Rationalization with Natural Language Prompts

    Ana Marasović, Iz Beltagy, Doug Downey, Matthew E. Peters. Findings of NAACL 2022. Self-rationalization models that predict task labels and generate free-text elaborations for their predictions could enable more intuitive interaction with NLP systems. These models are, however, currently trained with a large amount of human-written free…
  • NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

    Ximing Lu, S. Welleck, Peter West, Liwei Jiang, Jungo Kasai, Daniel Khashabi, Ronan Le Bras, Lianhui Qin, Youngjae Yu, Rowan Zellers, Noah A. Smith, Yejin Choi. NAACL 2022. The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however, requires foresight to plan ahead feasible future paths. Drawing…
  • Time Waits for No One! Analysis and Challenges of Temporal Misalignment

    Kelvin Luu, Daniel Khashabi, Suchin Gururangan, Karishma Mandyam, Noah A. Smith. NAACL 2022. When an NLP model is trained on text data from one time period and tested or deployed on data from another, the resulting temporal misalignment can degrade end-task performance. In this work, we establish a suite of eight diverse tasks across different…
  • Transparent Human Evaluation for Image Captioning

    Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Yejin Choi, Noah A. Smith. NAACL 2022. We establish a rubric-based human evaluation protocol for image captioning models. Our scoring rubrics and their definitions are carefully developed based on machine- and human-generated captions on the MSCOCO dataset. Each caption is evaluated along two main…
  • ABC: Attention with Bounded-memory Control

    Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith. ACL 2022. Transformer architectures have achieved state-of-the-art results on a variety of sequence modeling tasks. However, their attention mechanism comes with a quadratic complexity in sequence lengths, making the computational overhead prohibitive, especially for…
  • Generated Knowledge Prompting for Commonsense Reasoning

    Jiachen Liu, Alisa Liu, Ximing Lu, S. Welleck, Peter West, Ronan Le Bras, Yejin Choi, Hannaneh Hajishirzi. ACL 2022. Despite their ability to capture a large amount of knowledge during pretraining, large-scale language models often benefit from incorporating external knowledge bases, especially on commonsense reasoning tasks. This motivates us to explore how we can best…
  • Generating Data to Mitigate Spurious Correlations in Natural Language Inference Datasets

    Yuxiang Wu, Matt Gardner, Pontus Stenetorp, Pradeep Dasigi. ACL 2022. Natural language processing models often exploit spurious correlations between task-independent features and labels in datasets to perform well only within the distributions they are trained on, while not generalising to different task distributions. We…