Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
A Data Scientist's Guide to Start-Ups
In August 2013, we held a panel discussion at the KDD 2013 conference in Chicago on the subject of data science, data scientists, and start-ups. KDD is the premier conference on data science…
Open Question Answering Over Curated and Extracted Knowledge Bases
We consider the problem of open-domain question answering (Open QA) over massive knowledge bases (KBs). Existing approaches use either manually curated KBs like Freebase or KBs automatically…
Freebase QA: Information Extraction or Semantic Parsing?
We contrast two seemingly distinct approaches to the task of question answering (QA) using Freebase: one based on information extraction techniques, the other on semantic parsing. Results over the…
Learning Everything about Anything: Webly-Supervised Visual Concept Learning
Recognition is graduating from labs to real-world applications. While it is encouraging to see its potential being tapped, it brings forth a fundamental challenge to the vision researcher:…
Chinese Open Relation Extraction for Knowledge Acquisition
This study presents the Chinese Open Relation Extraction (CORE) system that is able to extract entity-relation triples from Chinese free texts based on a series of NLP techniques, i.e., word…
Discourse Complements Lexical Semantics for Non-factoid Answer Reranking
We propose a robust answer reranking model for non-factoid questions that integrates lexical semantics with discourse information, driven by two representations of discourse: a shallow…
Diagram Understanding in Geometry Questions
Automatically solving geometry questions is a longstanding AI problem. A geometry question typically includes a textual description accompanied by a diagram. The first step in solving geometry…
A Lightweight and High Performance Monolingual Word Aligner
Fast alignment is essential for many natural language tasks. But in the setting of monolingual alignment, previous work has not been able to align more than one sentence pair per second. We describe…
Automatic Coupling of Answer Extraction and Information Retrieval
Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily…
A Study of the Knowledge Base Requirements for Passing an Elementary Science Test
Our long-term interest is in machines that contain large amounts of general and scientific knowledge, stored in a "computable" form that supports reasoning and explanation. As a medium-term focus…