Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Grounded Situation Recognition

Sarah Pratt, Mark Yatskar, Luca Weihs, Aniruddha Kembhavi

2020

ECCV

We introduce Grounded Situation Recognition (GSR), a task that requires producing structured semantic summaries of images describing: the primary activity, entities engaged in the activity with…

Spatially Aware Multimodal Transformers for TextVQA

Yash Kant, Dhruv Batra, Peter Anderson, Harsh Agrawal

2020

ECCV

Textual cues are essential for everyday tasks like buying groceries and using public transport. To develop this assistive technology, we study the TextVQA task, i.e., reasoning about text in images…

VisualCOMET: Reasoning About the Dynamic Context of a Still Image

Jae Sung Park, Chandra Bhagavatula, Roozbeh Mottaghi, Yejin Choi

2020

ECCV

Even from a single frame of a still image, people can reason about the dynamic story of the image before, after, and beyond the frame. For example, given an image of a man struggling to stay afloat…

ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

Mohit Shridhar, Jesse Thomason, Daniel Gordon, Dieter Fox

2020

CVPR

We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions…

Butterfly Transform: An Efficient FFT Based Neural Architecture Design

Keivan Alizadeh, Ali Farhadi, Mohammad Rastegari

2020

CVPR

In this paper, we introduce the Butterfly Transform (BFT), a lightweight channel fusion method that reduces the computational complexity of point-wise convolutions from O(n^2) of conventional…

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

Matt Deitke, Winson Han, Alvaro Herrasti, Ali Farhadi

2020

CVPR

Visual recognition ecosystems (e.g. ImageNet, Pascal, COCO) have undeniably played a prevailing role in the evolution of modern computer vision. We argue that interactive and embodied visual AI has…

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

Kiana Ehsani, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta

2020

CVPR

When we humans look at a video of human-object interaction, we can not only infer what is happening but we can even extract actionable information and imitate those interactions. On the other hand,…

Visual Reaction: Learning to Play Catch with Your Drone

Kuo-Hao Zeng, Roozbeh Mottaghi, Luca Weihs, Ali Farhadi

2020

CVPR

In this paper we address the problem of visual reaction: the task of interacting with dynamic environments where the changes in the environment are not necessarily caused by the agent itself.…

What's Hidden in a Randomly Weighted Neural Network?

Vivek Ramanujan, Mitchell Wortsman, Aniruddha Kembhavi, Mohammad Rastegari

2020

CVPR

Training a neural network is synonymous with learning the values of the weights. In contrast, we demonstrate that randomly weighted neural networks contain subnetworks which achieve impressive…

Soft Threshold Weight Reparameterization for Learnable Sparsity

Aditya Kusupati, Vivek Ramanujan, Raghav Somani, Ali Farhadi

2020

ICML

Sparsity in Deep Neural Networks (DNNs) is studied extensively with the focus of maximizing prediction accuracy given an overall parameter budget. Existing methods rely on uniform or heuristic…
