Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
A Simple Yet Strong Pipeline for HotpotQA
State-of-the-art models for multi-hop question answering typically augment large-scale language models like BERT with additional, intuitively useful capabilities such as named entity recognition,…
UnifiedQA: Crossing Format Boundaries With a Single QA System
Question answering (QA) tasks have been posed using a variety of formats, such as extractive span selection, multiple choice, etc. This has led to format-specialized models, and even to an implicit…
Fact or Fiction: Verifying Scientific Claims
We introduce the task of scientific fact-checking. Given a corpus of scientific articles and a claim about a scientific finding, a fact-checking model must identify abstracts that support or refute…
TLDR: Extreme Summarization of Scientific Documents
We introduce TLDR generation for scientific papers, a new automatic summarization task with high source compression, requiring expert background knowledge and complex language understanding. To…
SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search
The COVID-19 pandemic has sparked unprecedented mobilization of scientists, already generating thousands of new papers that join a litany of previous biomedical work in related areas. This deluge of…
"You are grounded!": Latent Name Artifacts in Pre-trained Language Models
Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models. We focus on artifacts associated with the representation of given names (e.g.,…
What-if I ask you to explain: Explaining the effects of perturbations in procedural text
We address the task of explaining the effects of perturbations in procedural text, an important test of process comprehension. Consider a passage describing a rabbit's life-cycle: humans can easily…
Unsupervised Commonsense Question Answering with Self-Talk
Natural language understanding involves reading between the lines with implicit background knowledge. Current systems either rely on pre-trained language models as the sole implicit source of world…
Is Multihop QA in DiRe Condition? Measuring and Reducing Disconnected Reasoning
Has there been real progress in multi-hop question-answering? Models often exploit dataset artifacts to produce correct answers, without connecting information across multiple supporting facts. This…
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers
Mirroring the success of masked language models, vision-and-language counterparts like VILBERT, LXMERT and UNITER have achieved state-of-the-art performance on a variety of multimodal discriminative…