Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
A Lightweight and High Performance Monolingual Word Aligner
Fast alignment is essential for many natural language tasks. But in the setting of monolingual alignment, previous work has not been able to align more than one sentence pair per second. We describe…
Automatic Coupling of Answer Extraction and Information Retrieval
Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily…
A Study of the Knowledge Base Requirements for Passing an Elementary Science Test
Our long-term interest is in machines that contain large amounts of general and scientific knowledge, stored in a "computable" form that supports reasoning and explanation. As a medium-term focus…
Learning Biological Processes with Global Constraints
Biological processes are complex phenomena involving a series of events that are related to one another through various relationships. Systems that can understand and reason over biological…
Extracting Meronyms for a Biology Knowledge Base Using Distant Supervision
Knowledge of objects and their parts, meronym relations, are at the heart of many question-answering systems, but manually encoding these facts is impractical. Past researchers have tried…
Semi-Markov Phrase-based Monolingual Alignment
We introduce a novel discriminative model for phrase-based monolingual alignment using a semi-Markov CRF. Our model achieves state-of-the-art alignment accuracy on two phrase-based alignment datasets…
Answer Extraction as Sequence Tagging with Tree Edit Distance
Our goal is to extract answers from pre-retrieved sentences for Question Answering (QA). We construct a linear-chain Conditional Random Field based on pairs of questions and their possible answer…
Probabilistic coherence, logical consistency, and Bayesian learning: Neural language models as epistemic agents
It is argued that suitably trained neural language models exhibit key properties of epistemic agency: they hold probabilistically coherent and logically consistent degrees of belief, which they can…
Constructing a Textual KB from a Biology TextBook
As part of our work on building a "knowledgeable textbook" about biology, we are developing a textual question-answering (QA) system that can answer certain classes of biology questions posed by…
Finding Deceptive Opinion Spam by Any Stretch of the Imagination
Consumers increasingly rate, review and research products online (Jansen, 2010; Litvin et al., 2008). Consequently, websites containing consumer reviews are becoming targets of opinion spam. While…