Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
As major progress is made in open-ended text generation, measuring how close machine-generated text is to human language remains a critical open problem. We introduce MAUVE, a comparison measure…
MERLOT: Multimodal Neural Script Knowledge Models
As humans, we understand events in the visual world contextually, performing multimodal reasoning across time to make inferences about the past, present, and future. We introduce MERLOT, a model…
Natural Adversarial Objects
Although state-of-the-art object detection methods have shown compelling performance, models often are not robust to adversarial attacks and out-of-distribution data. We introduce a new dataset,…
NaturalProofs: Mathematical Theorem Proving in Natural Language
Understanding and creating mathematics using natural mathematical language – the mixture of symbolic and natural language used by humans – is a challenging and important problem for driving progress…
One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval
We present CORA, a Cross-lingual Open-Retrieval Answer Generation model that can answer questions across many languages even when language-specific annotated data or knowledge sources are…
Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing
Explainable NLP (ExNLP) has increasingly focused on collecting human-annotated explanations. These explanations are used downstream in three ways: as data augmentation to improve performance on a…
Bridging the Imitation Gap by Adaptive Insubordination
Why do agents often obtain better reinforcement learning policies when imitating a worse expert? We show that privileged information used by the expert is marginalized in the learned agent policy,…
Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text
Communicating with humans is challenging for AIs because it requires a shared understanding of the world, complex semantics (e.g., metaphors or analogies), and at times multimodal gestures (e.g.,…
Specializing Multilingual Language Models: An Empirical Study
Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance,…
Towards Personalized Descriptions of Scientific Concepts
A single scientific concept can be described in many different ways, and the most informative description depends on the audience. In this paper, we propose generating personalized scientific…