Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Visual Semantic Navigation using Scene Priors

Wei Yang, Xiaolong Wang, Ali Farhadi, Roozbeh Mottaghi

2019

ICLR

How do humans navigate to target objects in novel scenes? Do we use the semantic/functional priors we have built over years to efficiently search and navigate? For example, to search for mugs, we…

Actor and Observer: Joint Modeling of First and Third-Person Videos

Gunnar Sigurdsson, Cordelia Schmid, Ali Farhadi, Karteek Alahari

2018

CVPR

Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer…

Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering

Aishwarya Agrawal, Dhruv Batra, Devi Parikh, Aniruddha Kembhavi

2018

CVPR

A number of studies have found that today’s Visual Question Answering (VQA) models are heavily driven by superficial correlations in the training data and lack sufficient image grounding. To…

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

Sachin Mehta, Mohammad Rastegari, Anat Caspi, and Hannaneh Hajishirzi

2018

ECCV

We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints. ESPNet is based on a new convolutional module,…

Imagine This! Scripts to Compositions to Videos

Tanmay Gupta, Dustin Schwenk, Ali Farhadi, and Aniruddha Kembhavi

2018

ECCV

Imagining a scene described in natural language with realistic layout and appearance of entities is the ultimate test of spatial, visual, and semantic world knowledge. Towards this goal, we present…

IQA: Visual Question Answering in Interactive Environments

Daniel Gordon, Aniruddha Kembhavi, Mohammad Rastegari, Ali Farhadi

2018

CVPR

We introduce Interactive Question Answering (IQA), the task of answering questions that require an autonomous agent to interact with a dynamic visual environment. IQA presents the agent with a scene…

Transferring Common-Sense Knowledge for Object Detection

Krishna Kumar Singh, Santosh Kumar Divvala, Ali Farhadi, and Yong Jae Lee

2018

ECCV

We propose the idea of transferring common-sense knowledge from source categories to target categories for scalable object detection. In our setting, the training data for the source categories have…

Neural Motifs: Scene Graph Parsing with Global Context

Rowan Zellers, Mark Yatskar, Sam Thomson, Yejin Choi

2018

CVPR

We investigate the problem of producing structured graph representations of visual scenes. Our work analyzes the role of motifs: regularly appearing substructures in scene graphs. We present new…

SeGAN: Segmenting and Generating the Invisible

Kiana Ehsani, Roozbeh Mottaghi, Ali Farhadi

2018

CVPR

Objects often occlude each other in scenes; inferring their appearance beyond their visible parts plays an important role in scene understanding, depth estimation, object interaction and…

Structured Set Matching Networks for One-Shot Part Labeling

Jonghyun Choi, Jayant Krishnamurthy, Aniruddha Kembhavi, Ali Farhadi

2018

CVPR

Diagrams often depict complex phenomena and serve as a good test bed for visual and textual reasoning. However, understanding diagrams using natural image understanding approaches requires large…
