Skip to main content ->
Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Filter papers

FLEX: Unifying Evaluation for Few-Shot NLP

Jonathan BraggArman CohanKyle LoIz Beltagy
2021
NeurIPS

Few-shot NLP research is highly active, yet conducted in disjoint research threads with evaluation suites that lack challenging-yet-realistic testing setups and fail to employ careful experimental… 

Towards Personalized Descriptions of Scientific Concepts

Sonia K. MurthyDaniel KingTom HopeDoug Downey
2021
EMNLP 2021 • WiNLP

A single scientific concept can be described in many different ways, and the most informative description depends on the audience. In this paper, we propose generating personalized scientific… 

CDLM: Cross-Document Language Modeling

Avi CaciularuArman CohanIz BeltagyIdo Dagan
2021
Findings of EMNLP

We introduce a new pretraining approach for language models that are geared to support multi-document NLP tasks. Our crossdocument language model (CD-LM) improves masked language modeling for these… 

MS2: Multi-Document Summarization of Medical Studies

Jay DeYoungIz BeltagyMadeleine van ZuylenLucy Lu Wang
2021
EMNLP

To assess the effectiveness of any medical intervention, researchers must conduct a timeintensive and highly manual literature review. NLP systems can help to automate or assist in parts of this… 

SciA11y: Converting Scientific Papers to Accessible HTML

Lucy Lu WangIsabel CacholaJonathan BraggDaniel S. Weld
2021
ASSETS

We present SciA11y, a system that renders inaccessible scientific paper PDFs into HTML. SciA11y uses machine learning models to extract and understand the content of scientific PDFs, and reorganizes… 

SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts

Arie CattanSophie JohnsonDaniel S. WeldTom Hope
2021
AKBC

Determining coreference of concept mentions across multiple documents is fundamental for natural language understanding. Work on cross-document coreference resolution (CDCR) typically considers… 

Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study

Rahul NadkarniDavid WaddenIz BeltagyTom Hope
2021
AKBC

Biomedical knowledge graphs (KGs) hold rich information on entities such as diseases, drugs, and genes. Predicting missing links in these graphs can boost many important applications, such as drug… 

S2AND: A Benchmark and Evaluation System for Author Name Disambiguation

Shivashankar SubramanianDaniel KingDoug DowneySergey Feldman
2021
JCDL

Author Name Disambiguation (AND) is the task of resolving which author mentions in a bibliographic database refer to the same real-world person, and is a critical ingredient of digital library… 

Explaining Relationships Between Scientific Documents

Kelvin LuuXinyi WuRik Koncel-KedziorskiNoah A. Smit
2021
ACL

We address the task of explaining relationships between two scientific documents using natural language text. This task requires modeling the complex content of long technical documents, deducing a… 

PAWLS: PDF Annotation With Labels and Structure

Mark NeumannZejiang ShenSam Skjonsberg
2021
Demo • ACL

Adobe’s Portable Document Format (PDF) is a popular way of distributing view-only documents with a rich visual markup. This presents a challenge to NLP practitioners who wish to use the information…