An abstract illustration of swirling shapes, meant to denote a futuristic feeling.

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

A Dataset for N-ary Relation Extraction of Drug Combinations

Aryeh TiktinskyVijay ViswanathanDanna NiezniYoav Goldberg

2022

NAACL

Combination therapies have become the standard of care for diseases such as cancer, tuberculosis, malaria and HIV. However, the combinatorial set of available multi-drug treatments creates a…

Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection

Maarten SapSwabha SwayamdiptaLaura ViannaNoah A. Smith

2022

NAACL

Warning : this paper discusses and contains content that is offensive or upsetting. The perceived toxicity of language can vary based on someone’s identity and beliefs, but this variation is often…

Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

Jungo KasaiKeisuke SakaguchiRonan Le BrasNoah A. Smith

2022

NAACL

Natural language processing researchers have identified limitations of evaluation methodology for generation tasks, with new questions raised about the validity of automatic metrics and of…

Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer

Yanpeng ZhaoJack HesselYoungjae YuYejin Choi

2022

NAACL

Machines that can represent and describe environmental soundscapes have practical poten-tial, e.g., for audio tagging and captioning. Pre-vailing learning paradigms of audio-text connections have…

DEMix Layers: Disentangling Domains for Modular Language Modeling

Suchin GururanganMichael LewisAri HoltzmanLuke Zettlemoyer

2022

NAACL

We introduce a new domain expert mixture (DEMIX) layer that enables conditioning a language model (LM) on the domain of the input text. A DEMIX layer is a collection of expert feedforward networks,…

Efficient Hierarchical Domain Adaptation for Pretrained Language Models

Alexandra ChronopoulouMatthew E. PetersJesse Dodge

2022

NAACL

The remarkable success of large language models has been driven by dense models trained on massive unlabeled, unstructured corpora. These corpora typically contain text from diverse, heterogeneous…

Few-Shot Self-Rationalization with Natural Language Prompts

Ana MarasovićIz BeltagyDoug DowneyMatthew E. Peters

2022

Findings of NAACL

Self-rationalization models that predict task labels and generate free-text elaborations for their predictions could enable more intuitive interaction with NLP systems. These models are, however,…

Literature-Augmented Clinical Outcome Prediction

Aakanksha NaikS. ParasaSergey FeldmanTom Hope

2022

Findings of NAACL

We present BEEP (Biomedical Evidence-Enhanced Predictions), a novel approach for clinical outcome prediction that retrieves patient-specific medical literature and incorporates it into predictive…

Long Context Question Answering via Supervised Contrastive Learning

Avi CaciularuIdo DaganJacob GoldbergerArman Cohan

2022

NAACL

Long-context question answering (QA) tasks require reasoning over a long document or multiple documents. Addressing these tasks often beneﬁts from identifying a set of evidence spans (e.g.,…

MultiVerS: Improving scientific claim verification with weak supervision and full-document context

David WaddenKyle LoLucy Lu WangHannaneh Hajishirzi

2022

Findings of NAACL

The scientific claim verification task requires an NLP system to label scientific documents which Support or Refute an input claim, and to select evidentiary sentences (or rationales) justifying…

Previous372-381Next