Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

TuringAdvice: A Generative and Dynamic Evaluation of Language Use

Rowan Zellers, Ari Holtzman, Elizabeth Clark, Yejin Choi

2020

NAACL

We propose TuringAdvice, a new challenge task and dataset for language understanding models. Given a written situation that a real person is currently facing, a model must generate helpful advice in…

Evaluating NLP Models via Contrast Sets

M. Gardner, Y. Artzi, V. Basmova, et al.

2020

arXiv

Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading:…

Soft Threshold Weight Reparameterization for Learnable Sparsity

Aditya Kusupati, Vivek Ramanujan, Raghav Somani, Ali Farhadi

2020

ICML

Sparsity in Deep Neural Networks (DNNs) is studied extensively with the focus of maximizing prediction accuracy given an overall parameter budget. Existing methods rely on uniform or heuristic…

Differentiable Scene Graphs

Moshiko Raboh, Roei Herzig, Gal Chechik, Amir Globerson

2020

WACV

Understanding the semantics of complex visual scenes involves perception of entities and reasoning about their relations. Scene graphs provide a natural representation for these tasks, by assigning…

Multi-View Learning for Vision-and-Language Navigation

Qiaolin Xia, Xiujun Li, Chunyuan Li, Noah A. Smith

2020

arXiv

Learning to navigate in a visual environment following natural language instructions is a challenging task because natural language instructions are highly variable, ambiguous, and under-specified.…

Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping

Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Noah A. Smith

2020

arXiv

Fine-tuning pretrained contextual word embedding models to supervised downstream tasks has become commonplace in natural language processing. This process, however, is often brittle: even with the…

WinoGrande: An Adversarial Winograd Schema Challenge at Scale

Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi

2020

AAAI

The Winograd Schema Challenge (WSC), proposed by Levesque et al. (2011) as an alternative to the Turing Test, was originally designed as a pronoun resolution problem that cannot be solved based on…

PIQA: Reasoning about Physical Commonsense in Natural Language

Yonatan Bisk, Rowan Zellers, Ronan Le Bras, Yejin Choi

2020

AAAI

To apply eyeshadow without a brush, should I use a cotton swab or a toothpick? Questions requiring this kind of physical commonsense pose a challenge to today's natural language understanding…

Probing Natural Language Inference Models through Semantic Fragments

Kyle Richardson, Hai Na Hu, Lawrence S. Moss, Ashish Sabharwal

2020

AAAI

Do state-of-the-art models for language understanding already have, or can they easily learn, abilities such as boolean coordination, quantification, conditionals, comparatives, and monotonicity…

MonaLog: a Lightweight System for Natural Language Inference Based on Monotonicity

Hai Hu, Qi Chen, Kyle Richardson, Sandra Kübler

2020

SCIL

We present a new logic-based inference engine for natural language inference (NLI) called MonaLog, which is based on natural logic and the monotonicity calculus. In contrast to existing logic-based…
