Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks

Matthew E. Peters, Sebastian Ruder, Noah A. Smith
2019
ACL • RepL4NLP

While most previous work has focused on different pretraining objectives and architectures for transfer learning, we ask how to best adapt the pretrained model to a given target task. We focus on… 

Evaluating Gender Bias in Machine Translation

Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer
2019
ACL

We present the first challenge set and evaluation protocol for the analysis of gender bias in machine translation (MT). Our approach uses two recent coreference resolution datasets composed of… 

Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing

Ben Bogin, Jonathan Berant, Matt Gardner
2019
ACL

Research on parsing language to SQL has largely ignored the structure of the database (DB) schema, either because the DB was very simple, or because it was observed at both training and test time…

ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing

Mark Neumann, Daniel King, Iz Beltagy, Waleed Ammar
2019
ACL • BioNLP Workshop

Despite recent advances in natural language processing, many statistical models for processing text perform extremely poorly under domain shift. Processing biomedical and clinical text is a… 

Barack's Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling

Robert L. Logan IV, Nelson F. Liu, Matthew E. Peters, Sameer Singh
2019
ACL

Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge. However, traditional language models are only capable of remembering facts seen at… 

Sentence Mover's Similarity: Automatic Evaluation for Multi-Sentence Texts

Elizabeth Clark, Asli Çelikyilmaz, Noah A. Smith
2019
ACL

For evaluating machine-generated texts, automatic methods hold the promise of avoiding collection of human judgments, which can be expensive and time-consuming. The most common automatic metrics,… 

Is Attention Interpretable?

Sofia Serrano, Noah A. Smith
2019
ACL

Attention mechanisms have recently boosted performance on a range of NLP tasks. Because attention layers explicitly weight input components' representations, it is also often assumed that attention… 

SemEval-2019 Task 10: Math Question Answering

Mark Hopkins, Ronan Le Bras, Cristian Petrescu-Prahova, Rik Koncel-Kedziorski
2019
SemEval

We report on the SemEval 2019 task on math question answering. We provided a question set derived from Math SAT practice exams, including 2778 training questions and 1082 test questions. For a… 

Variational Pretraining for Semi-supervised Text Classification

Suchin Gururangan, Tam Dang, Dallas Card, Noah A. Smith
2019
ACL

We introduce VAMPIRE, a lightweight pretraining framework for effective text classification when data and computing resources are limited. We pretrain a unigram document model as a variational… 

DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Matt Gardner
2019
NAACL-HLT

Reading comprehension has recently seen rapid progress, with systems matching humans on the most popular datasets for the task. However, a large body of work has highlighted the brittleness of these…