Research - Papers
Explore a selection of our published work on key research challenges in AI.
To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks
While most previous work has focused on different pretraining objectives and architectures for transfer learning, we ask how to best adapt the pretrained model to a given target task. We focus on…
Evaluating Gender Bias in Machine Translation
We present the first challenge set and evaluation protocol for the analysis of gender bias in machine translation (MT). Our approach uses two recent coreference resolution datasets composed of…
Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing
Research on parsing language to SQL has largely ignored the structure of the database (DB) schema, either because the DB was very simple, or because it was observed at both training and test time.…
ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing
Despite recent advances in natural language processing, many statistical models for processing text perform extremely poorly under domain shift. Processing biomedical and clinical text is a…
Barack's Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling
Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge. However, traditional language models are only capable of remembering facts seen at…
Sentence Mover's Similarity: Automatic Evaluation for Multi-Sentence Texts
For evaluating machine-generated texts, automatic methods hold the promise of avoiding collection of human judgments, which can be expensive and time-consuming. The most common automatic metrics,…
Is Attention Interpretable?
Attention mechanisms have recently boosted performance on a range of NLP tasks. Because attention layers explicitly weight input components' representations, it is also often assumed that attention…
SemEval-2019 Task 10: Math Question Answering
We report on the SemEval 2019 task on math question answering. We provided a question set derived from Math SAT practice exams, including 2778 training questions and 1082 test questions. For a…
Variational Pretraining for Semi-supervised Text Classification
We introduce VAMPIRE, a lightweight pretraining framework for effective text classification when data and computing resources are limited. We pretrain a unigram document model as a variational…
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
Reading comprehension has recently seen rapid progress, with systems matching humans on the most popular datasets for the task. However, a large body of work has highlighted the brittleness of these…