Award Winning Papers

  • Robust fine-tuning of zero-shot models

    Mitchell Wortsman, Gabriel Ilharco, Mike Li, Jong Wook Kim, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt • CVPR 2022
    Best Paper Finalist
    Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve…
  • NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

    Ximing Lu, S. Welleck, Peter West, Liwei Jiang, Jungo Kasai, Daniel Khashabi, Ronan Le Bras, Lianhui Qin, Youngjae Yu, Rowan Zellers, Noah A. Smith, Yejin Choi • NAACL 2022
    Best Paper Award
    The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however, requires foresight to plan ahead feasible future paths. Drawing…
  • Understanding Dataset Difficulty with 𝒱-Usable Information

    Kawin Ethayarajh, Yejin Choi, Swabha Swayamdipta • ICML 2022
    Estimating the difficulty of a dataset typically involves comparing state-of-the-art models to humans; the bigger the performance gap, the harder the dataset is said to be. However, this comparison provides little understanding of how difficult each instance…
  • Hallett‐Mossop Rime Splintering Dims Cumulus Clouds Over the Southern Ocean: New Insight From Nudged Global Storm‐Resolving Simulations

    R. Atlas, C. Bretherton, M. Khairoutdinov, P. Blossey • AGU Advances 2022
    In clouds containing both liquid and ice with temperatures between −3°C and −8°C, liquid droplets collide with large ice crystals, freeze, and shatter, producing a plethora of small ice splinters. This process, known as Hallett‐Mossop rime splintering, and…
  • Correcting Coarse-Grid Weather and Climate Models by Machine Learning From Global Storm-Resolving Simulations

    C. S. Bretherton, B. Henn, A. Kwa, N. D. Brenowitz, O. Watt-Meyer, J. McGibbon, W. A. Perkins, S. K. Clark, L. Harris • Journal of Advances in Modeling Earth Systems 2022
    Global atmospheric 'storm-resolving' models with horizontal grid spacing of less than 5 km resolve deep cumulus convection and flow in complex terrain. They promise to be reference models that could be used to improve computationally affordable coarse-grid…
  • MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers

    Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, John Thickstun, S. Welleck, Yejin Choi, Z. Harchaoui • NeurIPS 2021
    As major progress is made in open-ended text generation, measuring how close machine-generated text is to human language remains a critical open problem. We introduce MAUVE, a comparison measure for open-ended text generation, which directly compares the…
  • Specializing Multilingual Language Models: An Empirical Study

    Ethan C. Chau, Noah A. Smith • EMNLP Workshop on Multilingual Representation Learning 2021
    Best Paper Honorable Mention
    Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance, extensibility, and interaction of two such adaptations: vocabulary…
  • SciA11y: Converting Scientific Papers to Accessible HTML

    Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Evie (Yu-Yen) Cheng, Chelsea Hess Haupt, Matt Latzke, Bailey Kuehl, Madeleine van Zuylen, Linda M. Wagner, Daniel S. Weld • ASSETS 2021
    Best Artifact Award
    We present SciA11y, a system that renders inaccessible scientific paper PDFs into HTML. SciA11y uses machine learning models to extract and understand the content of scientific PDFs, and reorganizes the resulting paper components into a form that better…
  • SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts

    Arie Cattan, Sophie Johnson, Daniel S. Weld, Ido Dagan, Iz Beltagy, Doug Downey, Tom Hope • AKBC 2021
    Determining coreference of concept mentions across multiple documents is fundamental for natural language understanding. Work on cross-document coreference resolution (CDCR) typically considers mentions of events in the news, which do not often involve…
  • All That’s ‘Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text

    Elizabeth Clark, Tal August, Sofia Serrano, Nikita Haduong, Suchin Gururangan, Noah A. Smith • ACL 2021
    Human evaluations are typically considered the gold standard in natural language generation, but as models' fluency improves, how well can evaluators detect and judge machine-generated text? We run a study assessing non-experts' ability to distinguish between…