Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Mixture Content Selection for Diverse Sequence Generation

Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi

2019

EMNLP

Generating diverse sequences is important in many NLP applications such as question generation or summarization that exhibit semantically one-to-many relationships between source and the target…

PaLM: A Hybrid Parser and Language Model

Hao Peng, Roy Schwartz, Noah A. Smith

2019

EMNLP

We present PaLM, a hybrid parser and neural language model. Building on an RNN language model, PaLM adds an attention layer over text spans in the left context. An unsupervised constituency parser…

Pretrained Language Models for Sequential Sentence Classification

Arman Cohan, Iz Beltagy, Daniel King, Daniel S. Weld

2019

EMNLP

As a step toward better document-level understanding, we explore classification of a sequence of sentences into their corresponding categories, a task that requires understanding sentences in…

QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions

Oyvind Tafjord, Matt Gardner, Kevin Lin, Peter Clark

2019

EMNLP

We introduce the first open-domain dataset, called QuaRTz, for reasoning about textual qualitative relationships. QuaRTz contains general qualitative statements, e.g., "A sunscreen with a higher SPF…

Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning

Pradeep Dasigi, Nelson F. Liu, Ana Marasovic, Matt Gardner

2019

EMNLP

Machine comprehension of texts longer than a single sentence often requires coreference resolution. However, most current reading comprehension benchmarks do not contain complex coreferential…

RNN Architecture Learning with Sparse Regularization

Jesse Dodge, Roy Schwartz, Hao Peng, Noah A. Smith

2019

EMNLP

Neural models for NLP typically use large numbers of parameters to reach state-of-the-art performance, which can lead to excessive memory usage and increased runtime. We present a structure learning…

SciBERT: A Pretrained Language Model for Scientific Text

Iz Beltagy, Kyle Lo, Arman Cohan

2019

EMNLP

Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive. We release SciBERT, a pretrained language model based on BERT (Devlin et al., 2018) to…

Show Your Work: Improved Reporting of Experimental Results

Jesse Dodge, Suchin Gururangan, Dallas Card, Noah A. Smith

2019

EMNLP

Research in natural language processing proceeds, in part, by demonstrating that new models achieve superior performance (e.g., accuracy) on held-out test data, compared to previous results. In this…

Social IQA: Commonsense Reasoning about Social Interactions

Maarten Sap, Hannah Rashkin, Derek Chen, Yejin Choi

2019

EMNLP

We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple choice questions for probing emotional and social…

SpanBERT: Improving Pre-training by Representing and Predicting Spans

Mandar Joshi, Danqi Chen, Yinhan Liu, Omer Levy

2019

EMNLP

We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text. Our approach extends BERT by (1) masking contiguous random spans, rather than random…
