Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

On-the-fly Definition Augmentation of LLMs for Biomedical NER

Monica Munnangi, Sergey Feldman, Byron C Wallace, Aakanksha Naik
NAACL 2024

Despite their general capabilities, LLMs still struggle on biomedical NER tasks, which are difficult due to the presence of specialized terminology and lack of training data. In this work we set out… 

Personalized Jargon Identification for Enhanced Interdisciplinary Communication

Yue Guo, Joseph Chee Chang, Maria Antoniak, Tal August

Scientific jargon can impede researchers when they read materials from other domains. Current methods of jargon identification mainly use corpus-level familiarity indicators (e.g., Simple Wikipedia… 

A Design Space for Intelligent and Interactive Writing Assistants

Mina Lee, Katy Ilonka Gero, John Joon Young Chung, Pao Siangliulue

In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge… 

Mitigating Barriers to Public Social Interaction with Meronymous Communication

Nouran Soliman, Hyeonsu B. Kang, Matthew Latzke, David R. Karger

In communities with social hierarchies, fear of judgment can discourage communication. While anonymity may alleviate some social pressure, fully anonymous spaces enable toxic behavior and hide the… 

PaperWeaver: Enriching Topical Paper Alerts by Contextualizing Recommended Papers with User-collected Papers

Yoonjoo Lee, Hyeonsu B Kang, Matt Latzke, Pao Siangliulue

With the rapid growth of scholarly archives, researchers subscribe to "paper alert" systems that periodically provide them with recommendations of recently published papers that are similar to…

CARE: Extracting Experimental Findings From Clinical Literature

Aakanksha Naik, Bailey Kuehl, Erin Bransom, Tom Hope
NAACL 2024

Extracting fine-grained experimental findings from literature can provide dramatic utility for scientific applications. Prior work has developed annotation schemas and datasets for limited aspects… 

Estimating the Causal Effect of Early ArXiving on Paper Acceptance

Yanai Elazar, Jiayao Zhang, David Wadden, Noah A. Smith

What is the effect of releasing a preprint of a paper before it is submitted for peer review? No randomized controlled trial has been conducted, so we turn to observational data to answer this… 

FigurA11y: AI Assistance for Writing Scientific Alt Text

Nikhil Singh, Lucy Lu Wang, Jonathan Bragg

High-quality alt text is crucial for making scientific figures accessible to blind and low-vision readers. Crafting complete, accurate alt text is challenging even for domain experts, as published… 

OLMo: Accelerating the Science of Language Models

Dirk Groeneveld, Iz Beltagy, Pete Walsh, Hanna Hajishirzi
ACL 2024

Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off,… 

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Luca Soldaini, Rodney Kinney, Akshita Bhagia, Kyle Lo
ACL 2024

Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often… 
