Papers



G-CNN: an Iterative Grid Based Object Detector
Mahyar Najibi, Mohammad Rastegari, and Larry Davis • CVPR • 2016
We introduce G-CNN, an object detection technique based on CNNs which works without proposal algorithms. G-CNN starts with a multi-scale grid of fixed bounding boxes. We train a regressor to move and scale elements of the grid towards objects iteratively. G…
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, and Abhinav Gupta • ECCV • 2016
Computer vision has great potential to help our daily lives by searching for lost keys, watering flowers or reminding us to take a pill. To succeed with such tasks, computer vision methods need to be trained from real and diverse examples of our daily…
Much Ado About Time: Exhaustive Annotation of Temporal Data
Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, and Abhinav Gupta • HCOMP • 2016
Large-scale annotated datasets allow AI systems to learn from and build upon the knowledge of the crowd. Many crowdsourcing techniques have been developed for collecting image annotations. These techniques often implicitly rely on the fact that a new input…
Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images
Roozbeh Mottaghi, Hessam Bagherinezhad, Mohammad Rastegari, and Ali Farhadi • CVPR • 2016
In this paper, we study the challenging problem of predicting the dynamics of objects in static images. Given a query object in an image, our goal is to provide a physical understanding of the object in terms of the forces acting upon it and its long term…
PDFFigures 2.0: Mining Figures from Research Papers
Christopher Clark and Santosh Divvala • JCDL • 2016
Figures and tables are key sources of information in many scholarly documents. However, current academic search engines do not make use of figures and tables when semantically parsing documents or presenting document summaries to users. To facilitate these…
Situation Recognition: Visual Semantic Role Labeling for Image Understanding
Mark Yatskar, Luke Zettlemoyer, and Ali Farhadi • CVPR • 2016
This paper introduces situation recognition, the problem of producing a concise summary of the situation an image depicts including: (1) the main activity (e.g., clipping), (2) the participating actors, objects, substances, and locations (e.g., man, shears…
Stating the Obvious: Extracting Visual Common Sense Knowledge
Mark Yatskar, Vicente Ordonez, and Ali Farhadi • NAACL • 2016
Obtaining common sense knowledge using current information extraction techniques is extremely challenging. In this work, we instead propose to derive simple common sense statements from fully annotated object detection corpora such as the Microsoft Common…
Toward a Taxonomy and Computational Models of Abnormalities in Images
Babak Saleh, Ahmed Elgammal, Jacob Feldman, and Ali Farhadi • AAAI • 2016
Best Student Paper Award
The human visual system can spot an abnormal image, and reason about what makes it strange. This task has not received enough attention in computer vision. In this paper we study various types of atypicalities in images in a more comprehensive way than has…
Unsupervised Deep Embedding for Clustering Analysis
Junyuan Xie, Ross Girshick, and Ali Farhadi • ICML • 2016
Clustering is central to many data-driven application domains and has been studied extensively in terms of distance functions and grouping algorithms. Relatively little work has focused on learning representations for clustering. In this paper, we propose…
"What happens if..." Learning to Predict the Effect of Forces in Images
Roozbeh Mottaghi, Mohammad Rastegari, Abhinav Gupta, and Ali Farhadi • ECCV • 2016
What happens if one pushes a cup sitting on a table toward the edge of the table? How about pushing a desk against a wall? In this paper, we study the problem of understanding the movements of objects as a result of applying external forces to them. For a…

