Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Grounded Situation Recognition

Sarah Pratt, Mark Yatskar, Luca Weihs, Aniruddha Kembhavi
2020
ECCV

We introduce Grounded Situation Recognition (GSR), a task that requires producing structured semantic summaries of images describing: the primary activity, entities engaged in the activity with… 

Spatially Aware Multimodal Transformers for TextVQA

Yash Kant, Dhruv Batra, Peter Anderson, Harsh Agrawal
2020
ECCV

Textual cues are essential for everyday tasks like buying groceries and using public transport. To develop this assistive technology, we study the TextVQA task, i.e., reasoning about text in images… 

VisualCOMET: Reasoning About the Dynamic Context of a Still Image

Jae Sung Park, Chandra Bhagavatula, Roozbeh Mottaghi, Yejin Choi
2020
ECCV

Even from a single frame of a still image, people can reason about the dynamic story of the image before, after, and beyond the frame. For example, given an image of a man struggling to stay afloat… 

ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

Mohit Shridhar, Jesse Thomason, Daniel Gordon, Dieter Fox
2020
CVPR

We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions… 

Butterfly Transform: An Efficient FFT Based Neural Architecture Design

Keivan Alizadeh, Ali Farhadi, Mohammad Rastegari
2020
CVPR

In this paper, we introduce the Butterfly Transform (BFT), a lightweight channel fusion method that reduces the computational complexity of point-wise convolutions from O(n^2) of conventional… 

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

Matt Deitke, Winson Han, Alvaro Herrasti, Ali Farhadi
2020
CVPR

Visual recognition ecosystems (e.g. ImageNet, Pascal, COCO) have undeniably played a prevailing role in the evolution of modern computer vision. We argue that interactive and embodied visual AI has… 

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

Kiana Ehsani, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta
2020
CVPR

When we humans look at a video of human-object interaction, we can not only infer what is happening but we can even extract actionable information and imitate those interactions. On the other hand,… 

Visual Reaction: Learning to Play Catch with Your Drone

Kuo-Hao Zeng, Roozbeh Mottaghi, Luca Weihs, Ali Farhadi
2020
CVPR

In this paper we address the problem of visual reaction: the task of interacting with dynamic environments where the changes in the environment are not necessarily caused by the agent itself.… 

What's Hidden in a Randomly Weighted Neural Network?

Vivek Ramanujan, Mitchell Wortsman, Aniruddha Kembhavi, Mohammad Rastegari
2020
CVPR

Training a neural network is synonymous with learning the values of the weights. In contrast, we demonstrate that randomly weighted neural networks contain subnetworks which achieve impressive… 

Soft Threshold Weight Reparameterization for Learnable Sparsity

Aditya Kusupati, Vivek Ramanujan, Raghav Somani, Ali Farhadi
2020
ICML

Sparsity in Deep Neural Networks (DNNs) is studied extensively with the focus of maximizing prediction accuracy given an overall parameter budget. Existing methods rely on uniform or heuristic…