Research - Papers
Explore a selection of our published work on key research challenges in AI.
You Only Look Once: Unified, Real-Time Object Detection
We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to…
Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images
In this paper, we study the challenging problem of predicting the dynamics of objects in static images. Given a query object in an image, our goal is to provide a physical understanding of the…
Situation Recognition: Visual Semantic Role Labeling for Image Understanding
This paper introduces situation recognition, the problem of producing a concise summary of the situation an image depicts including: (1) the main activity (e.g., clipping), (2) the participating…
Question Answering via Integer Programming over Semi-Structured Knowledge
Answering science questions posed in natural language is an important AI challenge. Answering such questions often requires non-trivial inference and knowledge that goes beyond factoid retrieval.…
Probabilistic Models for Learning a Semantic Parser Lexicon
We introduce several probabilistic models for learning the lexicon of a semantic parser. Lexicon learning is the first step of training a semantic parser for a new application domain and the quality…
Stating the Obvious: Extracting Visual Common Sense Knowledge
Obtaining common sense knowledge using current information extraction techniques is extremely challenging. In this work, we instead propose to derive simple common sense statements from fully…
Are Elephants Bigger than Butterflies? Reasoning about Sizes of Objects
Human vision greatly benefits from the information about sizes of objects. The role of size in several visual reasoning tasks has been thoroughly explored in human perception and cognition. However,…
Toward a Taxonomy and Computational Models of Abnormalities in Images
The human visual system can spot an abnormal image, and reason about what makes it strange. This task has not received enough attention in computer vision. In this paper we study various types of…
Instructable Intelligent Personal Agent
Unlike traditional machine learning methods, humans often learn from natural language instruction. As users become increasingly accustomed to interacting with mobile devices using speech, their…
Toward Automatic Bootstrapping of Online Communities Using Decision-theoretic Optimization
Successful online communities (e.g., Wikipedia, Yelp, and StackOverflow) can produce valuable content. However, many communities fail in their initial stages. Starting an online community is…