Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
High-Precision Extraction of Emerging Concepts from Scientific Literature
Identification of new concepts in scientific literature can help power faceted search, scientific trend analysis, knowledge-base construction, and more, but current methods are lacking. Manual…
Dawn: A high-level domain-specific language compiler toolchain for weather and climate applications
High-level programming languages that allow users to express numerical methods and generate efficient parallel implementations are of key importance for the productivity of domain scientists. The…
oLMpics - On what Language Model Pre-training Captures
Recent success of pre-trained language models (LMs) has spurred widespread interest in the language capabilities that they possess. However, efforts to understand whether LM representations are…
Break It Down: A Question Understanding Benchmark
Understanding natural language questions entails the ability to break down a question into the requisite steps for computing its answer. In this work, we introduce a Question Decomposition Meaning…
Adversarial Filters of Dataset Biases
Large neural models have demonstrated human-level performance on language and vision benchmarks such as ImageNet and Stanford Natural Language Inference (SNLI). Yet, their performance degrades…
Multi-class Hierarchical Question Classification for Multiple Choice Science Exams
Prior work has demonstrated that question classification (QC), recognizing the problem domain of a question, can help answer it more accurately. However, developing strong QC algorithms has been…
Transformers as Soft Reasoners over Language
AI has long pursued the goal of having systems reason over explicitly provided knowledge, but building suitable representations has proved challenging. Here we explore whether transformers can…
TransOMCS: From Linguistic Graphs to Commonsense Knowledge
Commonsense knowledge acquisition is a key problem for artificial intelligence. Conventional methods of acquiring commonsense knowledge generally require laborious and costly human annotations,…
CORD-19: The Covid-19 Open Research Dataset
The Covid-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on Covid-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development…
SUPP.AI: Finding Evidence for Supplement-Drug Interactions
Dietary supplements are used by a large portion of the population, but information on their pharmacologic interactions is incomplete. To address this challenge, we present SUPP.AI, an…