Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Counterfactual Story Reasoning and Generation
Counterfactual reasoning requires predicting how alternative events, contrary to what actually happened, might have resulted in different outcomes. Despite being considered a necessary component of…
A Discrete Hard EM Approach for Weakly Supervised Question Answering
Many question answering (QA) tasks only provide weak supervision for how the answer should be computed. For example, TriviaQA answers are entities that can be mentioned multiple times in supporting…
Entity, Relation, and Event Extraction with Contextualized Span Representations
We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction. Our framework (called…
Mixture Content Selection for Diverse Sequence Generation
Generating diverse sequences is important in many NLP applications such as question generation or summarization that exhibit semantically one-to-many relationships between source and the target…
BERT for Coreference Resolution: Baselines and Analysis
We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks. A qualitative analysis of model predictions indicates that, compared…
“Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding
Understanding time is crucial for understanding events expressed in natural language. Because people rarely say the obvious, it is often necessary to have commonsense knowledge about various…
Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text
Our goal is to better comprehend procedural text, e.g., a paragraph about photosynthesis, by not only predicting what happens, but why some actions need to happen before others. Our approach builds…
Transfer Learning Between Related Tasks Using Expected Label Proportions
Deep learning systems thrive on an abundance of labeled training data, but such data is not always available, calling for alternative methods of supervision. One such method is expectation…
Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training
We focus on the problem of language modeling for code-switched language, in the context of automatic speech recognition (ASR). Language modeling for code-switched language is challenging for (at…
SpanBERT: Improving Pre-training by Representing and Predicting Spans
We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text. Our approach extends BERT by (1) masking contiguous random spans, rather than random…