Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

CORD-19: The Covid-19 Open Research Dataset

L. Lu Wang, K. Lo, Y. Chandrasekhar, S. Kohlmeier
2020
ACL • NLP-COVID

The Covid-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on Covid-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development…

SUPP.AI: finding evidence for supplement-drug interactions

Lucy Lu Wang, Oyvind Tafjord, Arman Cohan, Waleed Ammar
2020
ACL • Demo

Dietary supplements are used by a large portion of the population, but information on their pharmacologic interactions is incomplete. To address this challenge, we present SUPP.AI, an…

Language (Re)modelling: Towards Embodied Language Understanding

Ronen Tamari, Chen Shani, Tom Hope, Dafna Shahaf
2020
ACL

While natural language understanding (NLU) is advancing rapidly, today’s technology differs from human-like language understanding in fundamental ways, notably in its inferior efficiency,… 

S2ORC: The Semantic Scholar Open Research Corpus

Kyle Lo, Lucy Lu Wang, Mark E Neumann, Daniel S. Weld
2020
ACL

We introduce S2ORC, a large contextual citation graph of English-language academic papers from multiple scientific domains; the corpus consists of 81.1M papers, 380.5M citation edges, and associated… 

SciREX: A Challenge Dataset for Document-Level Information Extraction

Sarthak Jain, Madeleine van Zuylen, Hannaneh Hajishirzi, Iz Beltagy
2020
ACL

Extracting information from full documents is an important problem in many domains, but most previous work focuses on identifying relationships within a sentence or a paragraph. It is challenging to…

SPECTER: Document-level Representation Learning using Citation-informed Transformers

Arman Cohan, Sergey Feldman, Iz Beltagy, Daniel S. Weld
2020
ACL

Representation learning is a critical ingredient for natural language processing systems. Recent Transformer language models like BERT learn powerful textual representations, but these models are… 

Stolen Probability: A Structural Weakness of Neural Language Models

David Demeter, Gregory Kimmel, Doug Downey
2020
ACL

Neural Network Language Models (NNLMs) generate probability distributions by applying a softmax function to a distance metric formed by taking the dot product of a prediction vector with all word… 

TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection

Ellen M. Voorhees, Tasmeer Alam, Steven Bedrick, Lucy Lu Wang
2020
arXiv

TREC-COVID is a community evaluation designed to build a test collection that captures the information needs of biomedical researchers using the scientific literature during a pandemic. One of the… 

TREC-COVID: Rationale and Structure of an Information Retrieval Shared Task for COVID-19

Kirk Roberts, Tasmeer Alam, Steven Bedrick, William R. Hersh
2020
JAMIA

TREC-COVID is an information retrieval (IR) shared task initiated to support clinicians and clinical research during the COVID-19 pandemic. IR for pandemics breaks many normal assumptions, which can… 

Ranking Significant Discrepancies in Clinical Reports

Sean MacAvaney, Arman Cohan, Nazli Goharian, Ross Filice
2020
ECIR

Medical errors are a major public health concern and a leading cause of death worldwide. Many healthcare centers and hospitals use reporting systems where medical practitioners write a preliminary…