Papers

  • MultiVerS: Improving scientific claim verification with weak supervision and full-document context

    David Wadden, Kyle Lo, Lucy Lu Wang, Arman Cohan, Iz Beltagy, Hannaneh Hajishirzi. NAACL Findings, 2022. The scientific claim verification task requires an NLP system to label scientific documents which Support or Refute an input claim, and to select evidentiary sentences (or rationales) justifying each predicted label. In this work, we present MultiVerS, which…
  • NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

    Ximing Lu, S. Welleck, Peter West, Liwei Jiang, Jungo Kasai, Daniel Khashabi, Ronan Le Bras, Lianhui Qin, Youngjae Yu, Rowan Zellers, Noah A. Smith, Yejin Choi. NAACL, 2022.
    Best Paper Award
    The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however, requires foresight to plan ahead feasible future paths. Drawing…
  • Time Waits for No One! Analysis and Challenges of Temporal Misalignment

    Kelvin Luu, Daniel Khashabi, Suchin Gururangan, Karishma Mandyam, Noah A. Smith. NAACL, 2022. When an NLP model is trained on text data from one time period and tested or deployed on data from another, the resulting temporal misalignment can degrade end-task performance. In this work, we establish a suite of eight diverse tasks across different…
  • Transparent Human Evaluation for Image Captioning

    Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Yejin Choi, Noah A. Smith. NAACL, 2022. We establish a rubric-based human evaluation protocol for image captioning models. Our scoring rubrics and their definitions are carefully developed based on machine- and human-generated captions on the MSCOCO dataset. Each caption is evaluated along two main…
  • Data Governance in the Age of Large-Scale Data-Driven Language Technology

    Yacine Jernite, Huu Nguyen, Stella Rose Biderman, A. Rogers, Maraim Masoud, V. Danchev, Samson Tan, A. Luccioni, Nishant Subramani, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Isaac Johnson, Dragomir R. Radev, Somaieh Nikpoor, Jorg Frohberg, Aaron Gokaslan, Peter Henderson, Rishi Bommasani, Margaret Mitchell. FAccT, 2022. The recent emergence and adoption of Machine Learning technology, and specifically of Large Language Models, has drawn attention to the need for systematic and transparent management of language data. This work proposes an approach to global language data…
  • Measuring the Carbon Intensity of AI in Cloud Instances

    Jesse Dodge, Taylor Prewitt, Rémi Tachet des Combes, Erika Odmark, Roy Schwartz, Emma Strubell, A. Luccioni, Noah A. Smith, Nicole DeCario, Will Buchanan. FAccT, 2022. The advent of cloud computing has provided people around the world with unprecedented access to computational power and enabled rapid growth in technologies such as machine learning, the computational demands of which incur a high energy cost and a…
  • Domain Mismatch Doesn’t Always Prevent Cross-Lingual Transfer Learning

    Daniel Edmiston, Phillip Keung, Noah A. Smith. LREC, 2022. Cross-lingual transfer learning without labeled target language data or parallel text has been surprisingly effective in zero-shot cross-lingual classification, question answering, unsupervised machine translation, etc. However, some recent publications have…
  • What Language Model to Train if You Have One Million GPU Hours?

    Teven Le Scao, Thomas Wang, Daniel Hesslow, Lucile Saulnier, Stas Bekman, Saiful Bari, Stella Rose Biderman, Hady ElSahar, Jason Phang, Ofir Press, Colin Raffel, Victor Sanh, Sheng Shen, Lintang A. Sutawika, Jaesung Tae, Zheng Xin Yong, Julien Launay, Iz Beltagy. ACL BigScience Workshop, 2022. The crystallization of modeling methods around the Transformer architecture has been a boon for practitioners. Simple, well-motivated architectural variations can transfer across tasks and scale, increasing the impact of modeling research. However, with the…
  • Retrieval Data Augmentation Informed by Downstream Question Answering Performance

    James Ferguson, Pradeep Dasigi, Tushar Khot, Hannaneh Hajishirzi. ACL • FEVER, 2022. Training retrieval models to fetch contexts for Question Answering (QA) over large corpora requires labeling relevant passages in those corpora. Since obtaining exhaustive manual annotations of all relevant passages is not feasible, prior work uses text…
  • NaturalProver: Grounded Mathematical Proof Generation with Language Models

    S. Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi. arXiv, 2022. Theorem proving in natural mathematical language - the mixture of symbolic and natural language used by humans - plays a central role in mathematical advances and education, and tests aspects of reasoning that are core to intelligence. Yet it has remained…