Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Who Let The Dogs Out? Modeling Dog Behavior From Visual Data

Kiana Ehsani, Hessam Bagherinezhad, Joe Redmon, Ali Farhadi
2018
CVPR

We study the task of directly modelling a visually intelligent agent. Computer vision typically focuses on solving various subtasks related to visual intelligence. We depart from this standard… 

AI2-THOR: An Interactive 3D Environment for Visual AI

Eric Kolve, Roozbeh Mottaghi, Winson Han, Ali Farhadi
2017
arXiv

We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at this http URL. AI2-THOR consists of near photo-realistic 3D indoor scenes, where AI agents can navigate… 

Are You Smarter Than A Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension

Aniruddha Kembhavi, Minjoon Seo, Dustin Schwenk, and Ali Farhadi
2017
CVPR

We introduce the task of Multi-Modal Machine Comprehension (M3C), which aims at answering multimodal questions given a context of text, diagrams and images. We present the Textbook Question… 

Asynchronous Temporal Fields for Action Recognition

Gunnar A Sigurdsson, Santosh Divvala, Ali Farhadi, and Abhinav Gupta
2017
CVPR

Actions are more than just movements and trajectories: we cook to eat and we hold a cup to drink from it. A thorough understanding of videos requires going beyond appearance modeling and… 

Bidirectional Attention Flow for Machine Comprehension

Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi
2017
ICLR

Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query. Recently, attention mechanisms have been… 

Commonly Uncommon: Semantic Sparsity in Situation Recognition

Mark Yatskar, Vicente Ordonez, Luke Zettlemoyer, and Ali Farhadi
2017
CVPR

Semantic sparsity is a common challenge in structured visual classification problems; when the output space is complex, the vast majority of the possible predictions are rarely, if ever, seen in the… 

LCNN: Lookup-based Convolutional Neural Network

Hessam Bagherinezhad, Mohammad Rastegari, and Ali Farhadi
2017
CVPR

Porting state of the art deep learning algorithms to resource constrained compute platforms (e.g. VR, AR, wearables) is extremely challenging. We propose a fast, compact, and accurate model for… 

Query-Reduction Networks for Question Answering

Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
2017
ICLR

In this paper, we study the problem of question answering when reasoning over multiple facts is required. We propose Query-Reduction Network (QRN), a variant of Recurrent Neural Network (RNN) that… 

See the Glass Half Full: Reasoning about Liquid Containers, their Volume and Content

Roozbeh Mottaghi, Connor Schenck, Dieter Fox, Ali Farhadi
2017
ICCV

Humans have rich understanding of liquid containers and their contents; for example, we can effortlessly pour water from a pitcher to a cup. Doing so requires estimating the volume of the cup,… 

Target-driven visual navigation in indoor scenes using deep reinforcement learning

Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, and Ali Farhadi
2017
ICRA

Two less addressed issues of deep reinforcement learning are (1) lack of generalization capability to new goals, and (2) data inefficiency, i.e., the model requires several (and often costly)…