Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Topics to Avoid: Demoting Latent Confounds in Text Classification
Despite impressive performance on many text classification tasks, deep neural networks tend to learn frequent superficial patterns that are specific to the training data and do not always generalize…
Transfer Learning Between Related Tasks Using Expected Label Proportions
Deep learning systems thrive on abundance of labeled training data but such data is not always available, calling for alternative methods of supervision. One such method is expectation…
Universal Adversarial Triggers for Attacking and Analyzing NLP
dversarial examples highlight model vulnerabilities and are useful for evaluation and interpretation. We define universal adversarial triggers: input-agnostic sequences of tokens that trigger a…
WIQA: A dataset for "What if..." reasoning over procedural text
We introduce WIQA, the first large-scale dataset of "What if..." questions over procedural text. WIQA contains three parts: a collection of paragraphs each describing a process, e.g., beach erosion;…
Y'all should read this! Identifying Plurality in Second-Person Personal Pronouns in English Texts
Distinguishing between singular and plural "you" in English is a challenging task which has potential for downstream applications, such as machine translation or coreference resolution. While formal…
Robust Navigation with Language Pretraining and Stochastic Sampling
Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and…
Adversarial Removal of Demographic Attributes from Text Data
Recent advances in Representation Learning and Adversarial Training seem to succeed in removing unwanted features from the learned representation. We show that demographic information of authors is…
Bridging Knowledge Gaps in Neural Entailment via Symbolic Models
Most textual entailment models focus on lexical gaps between the premise text and the hypothesis, but rarely on knowledge gaps. We focus on filling these knowledge gaps in the Science Entailment…
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
We present a new kind of question answering dataset, OpenBookQA, modeled after open book exams for assessing human understanding of a subject. The open book that comes with our questions is a set of…
Can LSTM Learn to Capture Agreement? The Case of Basque
Sequential neural networks models are powerful tools in a variety of Natural Language Processing (NLP) tasks. The sequential nature of these models raises the questions: to what extent can these…