Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation

Abhilasha Ravichander, Matt Gardner, Ana Marasović

2022

EMNLP

The full power of human language-based communication cannot be realized without negation. All human languages have some form of negation. Despite this, negation remains a challenging phenomenon for…

ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Roozbeh Mottaghi

2022

NeurIPS

Massive datasets and high-capacity models have driven many recent advancements in computer vision and natural language understanding. This work presents a platform to enable similar success stories…

Robust fine-tuning of zero-shot models

Mitchell Wortsman, Gabriel Ilharco, Mike Li, Ludwig Schmidt

2022

CVPR

Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset).…

NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

Ximing Lu, S. Welleck, Peter West, Yejin Choi

2022

NAACL

The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however,…

Understanding Dataset Difficulty with 𝒱-Usable Information

Kawin Ethayarajh, Yejin Choi, and Swabha Swayamdipta

2022

ICML

Estimating the difficulty of a dataset typically involves comparing state-of-the-art models to humans; the bigger the performance gap, the harder the dataset is said to be. However, this comparison…

Hallett‐Mossop Rime Splintering Dims Cumulus Clouds Over the Southern Ocean: New Insight From Nudged Global Storm‐Resolving Simulations

R. Atlas, C. Bretherton, M. Khairoutdinov, P. Blossey

2022

AGU Advances

In clouds containing both liquid and ice with temperatures between −3°C and −8°C, liquid droplets collide with large ice crystals, freeze, and shatter, producing a plethora of small ice splinters.…

Correcting Coarse-Grid Weather and Climate Models by Machine Learning From Global Storm-Resolving Simulations

Bretherton, C. S., B. Henn, and L. Harris

2022

Journal of Advances in Modeling Earth Systems

Global atmospheric 'storm-resolving' models with horizontal grid spacing of less than 5 km resolve deep cumulus convection and flow in complex terrain. They promise to be reference models that could…

MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers

Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, Z. Harchaoui

2021

NeurIPS

As major progress is made in open-ended text generation, measuring how close machine-generated text is to human language remains a critical open problem. We introduce MAUVE, a comparison measure…

Specializing Multilingual Language Models: An Empirical Study

Ethan C. Chau, Noah A. Smith

2021

EMNLP • Workshop on Multilingual Representation Learning

Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance,…

SciA11y: Converting Scientific Papers to Accessible HTML

Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Daniel S. Weld

2021

ASSETS

We present SciA11y, a system that renders inaccessible scientific paper PDFs into HTML. SciA11y uses machine learning models to extract and understand the content of scientific PDFs, and reorganizes…
