Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Roozbeh Mottaghi
2022
NeurIPS

Massive datasets and high-capacity models have driven many recent advancements in computer vision and natural language understanding. This work presents a platform to enable similar success stories… 

Robust fine-tuning of zero-shot models

Mitchell Wortsman, Gabriel Ilharco, Mike Li, Ludwig Schmidt
2022
CVPR

Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset).… 

NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

Ximing Lu, S. Welleck, Peter West, Yejin Choi
2022
NAACL

The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however,… 

Understanding Dataset Difficulty with 𝒱-Usable Information

Kawin Ethayarajh, Yejin Choi, and Swabha Swayamdipta
2022
ICML

Estimating the difficulty of a dataset typically involves comparing state-of-the-art models to humans; the bigger the performance gap, the harder the dataset is said to be. However, this comparison… 

Hallett‐Mossop Rime Splintering Dims Cumulus Clouds Over the Southern Ocean: New Insight From Nudged Global Storm‐Resolving Simulations

R. Atlas, C. Bretherton, M. Khairoutdinov, P. Blossey
2022
AGU Advances

In clouds containing both liquid and ice with temperatures between −3°C and −8°C, liquid droplets collide with large ice crystals, freeze, and shatter, producing a plethora of small ice splinters.… 

Correcting Coarse-Grid Weather and Climate Models by Machine Learning From Global Storm-Resolving Simulations

C. S. Bretherton, B. Henn, and L. Harris
2022
Journal of Advances in Modeling Earth Systems

Global atmospheric 'storm-resolving' models with horizontal grid spacing of less than 5 km resolve deep cumulus convection and flow in complex terrain. They promise to be reference models that could… 

MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers

Krishna Pillutla, Swabha Swayamdipta, Rowan Zellers, Z. Harchaoui
2021
NeurIPS

As major progress is made in open-ended text generation, measuring how close machine-generated text is to human language remains a critical open problem. We introduce MAUVE, a comparison measure… 

Specializing Multilingual Language Models: An Empirical Study

Ethan C. Chau, Noah A. Smith
2021
EMNLP • Workshop on Multilingual Representation Learning

Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance,… 

SciA11y: Converting Scientific Papers to Accessible HTML

Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Daniel S. Weld
2021
ASSETS

We present SciA11y, a system that renders inaccessible scientific paper PDFs into HTML. SciA11y uses machine learning models to extract and understand the content of scientific PDFs, and reorganizes… 

SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts

Arie Cattan, Sophie Johnson, Daniel S. Weld, Tom Hope
2021
AKBC

Determining coreference of concept mentions across multiple documents is fundamental for natural language understanding. Work on cross-document coreference resolution (CDCR) typically considers…