Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Explain like I am a Scientist: The Linguistic Barriers of Entry to r/science

Tal August, Dallas Card, Gary Hsieh, Katharina Reinecke
2020
CHI

As an online community for discussing research findings, r/science has the potential to contribute to science outreach and communication with a broad audience. Yet previous work suggests that most… 

Evaluating Machines by their Real-World Language Use

Rowan Zellers, Ari Holtzman, Elizabeth Anne Clark, Yejin Choi
2020
arXiv

There is a fundamental gap between how humans understand and use language – in open-ended, real-world situations – and today’s NLP benchmarks for language understanding. To narrow this gap, we…

Ranking Significant Discrepancies in Clinical Reports

Sean MacAvaney, Arman Cohan, Nazli Goharian, Ross Filice
2020
ECIR

Medical errors are a major public health concern and a leading cause of death worldwide. Many healthcare centers and hospitals use reporting systems where medical practitioners write a preliminary… 

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

Xiujun Li, Xi Yin, Chunyuan Li, Jianfeng Gao
2020
ECCV

Large-scale pre-training methods of learning cross-modal representations on image-text pairs are becoming popular for vision-language tasks. While existing methods simply concatenate image region… 

Longformer: The Long-Document Transformer

Iz Beltagy, Matthew E. Peters, Arman Cohan
2020
arXiv

Transformer-based models are unable to process long sequences due to their self-attention operation, which scales quadratically with the sequence length. To address this limitation, we introduce the… 

TuringAdvice: A Generative and Dynamic Evaluation of Language Use

Rowan Zellers, Ari Holtzman, Elizabeth Clark, Yejin Choi
2020
NAACL

We propose TuringAdvice, a new challenge task and dataset for language understanding models. Given a written situation that a real person is currently facing, a model must generate helpful advice in… 

Evaluating NLP Models via Contrast Sets

M. Gardner, Y. Artzi, V. Basmova, et al.
2020
arXiv

Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading:… 

Soft Threshold Weight Reparameterization for Learnable Sparsity

Aditya Kusupati, Vivek Ramanujan, Raghav Somani, Ali Farhadi
2020
ICML

Sparsity in Deep Neural Networks (DNNs) is studied extensively with the focus of maximizing prediction accuracy given an overall parameter budget. Existing methods rely on uniform or heuristic… 

Differentiable Scene Graphs

Moshiko Raboh, Roei Herzig, Gal Chechik, Amir Globerson
2020
WACV

Understanding the semantics of complex visual scenes involves perception of entities and reasoning about their relations. Scene graphs provide a natural representation for these tasks, by assigning… 

Multi-View Learning for Vision-and-Language Navigation

Qiaolin Xia, Xiujun Li, Chunyuan Li, Noah A. Smith
2020
arXiv

Learning to navigate in a visual environment following natural language instructions is a challenging task because natural language instructions are highly variable, ambiguous, and under-specified.…