
Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Document-Level Definition Detection in Scholarly Documents: Existing Models, Error Analyses, and Future Directions

Dongyeop Kang, Andrew Head, Risham Sidhu, Marti A. Hearst
2020
EMNLP • SDP Workshop

The task of definition detection is important for scholarly papers, because papers often make use of technical terminology that may be unfamiliar to readers. Despite prior work on definition… 

The Extraordinary Failure of Complement Coercion Crowdsourcing

Yanai Elazar, Victoria Basmov, Shauli Ravfogel, Reut Tsarfaty
2020
EMNLP • Insights from Negative Results in NLP Workshop

Crowdsourcing has eased and scaled up the collection of linguistic annotation in recent years. In this work, we follow known methodologies of collecting labeled data for the complement coercion… 

A Simple Yet Strong Pipeline for HotpotQA

Dirk Groeneveld, Tushar Khot, Mausam, Ashish Sabharwal
2020
EMNLP

State-of-the-art models for multi-hop question answering typically augment large-scale language models like BERT with additional, intuitively useful capabilities such as named entity recognition,… 

UnifiedQA: Crossing Format Boundaries With a Single QA System

Daniel Khashabi, Sewon Min, Tushar Khot, Hannaneh Hajishirzi
2020
Findings of EMNLP

Question answering (QA) tasks have been posed using a variety of formats, such as extractive span selection, multiple choice, etc. This has led to format-specialized models, and even to an implicit… 

Fact or Fiction: Verifying Scientific Claims

David Wadden, Kyle Lo, Lucy Lu Wang, Hannaneh Hajishirzi
2020
EMNLP

We introduce the task of scientific fact-checking. Given a corpus of scientific articles and a claim about a scientific finding, a fact-checking model must identify abstracts that support or refute… 

TLDR: Extreme Summarization of Scientific Documents

Isabel Cachola, Kyle Lo, Arman Cohan, Daniel S. Weld
2020
Findings of EMNLP

We introduce TLDR generation for scientific papers, a new automatic summarization task with high source compression, requiring expert background knowledge and complex language understanding. To… 

SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search

Tom Hope, Jason Portenoy, Kishore Vasan, Jevin D. West
2020
EMNLP • Demo

The COVID-19 pandemic has sparked unprecedented mobilization of scientists, already generating thousands of new papers that join a litany of previous biomedical work in related areas. This deluge of… 

"You are grounded!": Latent Name Artifacts in Pre-trained Language Models

Vered Shwartz, Rachel Rudinger, Oyvind Tafjord
2020
EMNLP

Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models. We focus on artifacts associated with the representation of given names (e.g.,… 

What-if I ask you to explain: Explaining the effects of perturbations in procedural text

Dheeraj Rajagopal, Niket Tandon, Peter Clark, Eduard H. Hovy
2020
Findings of EMNLP

We address the task of explaining the effects of perturbations in procedural text, an important test of process comprehension. Consider a passage describing a rabbit's life-cycle: humans can easily… 

Unsupervised Commonsense Question Answering with Self-Talk

Vered Shwartz, Peter West, Ronan Le Bras, Yejin Choi
2020
EMNLP

Natural language understanding involves reading between the lines with implicit background knowledge. Current systems either rely on pre-trained language models as the sole implicit source of world…