Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models

Alexandra Chronopoulou, Matthew E. Peters, Alexander M. Fraser, Jesse Dodge
2023
Findings of EACL 2023

Pretrained language models (PLMs) are trained on massive corpora, but often need to specialize to specific domains. A parameter-efficient adaptation method suggests training an adapter for each… 
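
The weight averaging named in the title can be pictured with the minimal sketch below. It assumes each domain adapter is available as a plain PyTorch parameter dictionary with matching keys; the function name and storage format are illustrative, not the paper's actual implementation.

```python
# Minimal sketch: average the weights of several domain adapters into one "soup".
from typing import Dict, List

import torch


def average_adapters(adapters: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    """Uniformly average the parameters of adapters trained on different domains."""
    return {
        name: torch.stack([adapter[name] for adapter in adapters]).mean(dim=0)
        for name in adapters[0]
    }
```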

Specializing Smaller Language Models towards Multi-Step Reasoning

Yao Fu, Hao Peng, Litu Ou, Tushar Khot
2023
ICML

The surprising ability of Large Language Models (LLMs) to perform well on complex reasoning with only few-shot chain-of-thought prompts is believed to emerge only in very large-scale models (100+… 
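
As a reminder of what few-shot chain-of-thought prompting looks like, here is a small illustrative prompt; the exemplars are made up for this sketch and are not taken from the paper.

```python
# A tiny few-shot chain-of-thought prompt: each exemplar's answer spells out
# its intermediate reasoning steps before the final answer.
COT_PROMPT = """\
Q: A farmer has 3 pens with 4 sheep each. He sells 5 sheep. How many remain?
A: 3 pens x 4 sheep = 12 sheep. 12 - 5 = 7. The answer is 7.

Q: Anna reads 12 pages a day. How many pages does she read in a week?
A: 12 pages x 7 days = 84. The answer is 84.

Q: {question}
A:"""

print(COT_PROMPT.format(question="A train travels 60 km per hour for 3 hours. How far does it go?"))
```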

Do Embodied Agents Dream of Pixelated Sheep?: Embodied Decision Making using Language Guided World Modelling

Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Roy Fox
2023
arXiv

Reinforcement learning (RL) agents typically learn tabula rasa, without prior knowledge of the world, which makes learning complex tasks with sparse rewards difficult. If initialized with knowledge… 

Does progress on ImageNet transfer to real-world datasets?

Alexander W. Fang, Simon Kornblith, Ludwig Schmidt
2023
arXiv

Does progress on ImageNet transfer to real-world datasets? We investigate this question by evaluating ImageNet pre-trained models with varying accuracy (57% - 83%) on six practical image… 

Reproducible scaling laws for contrastive language-image learning

Mehdi Cherti, Romain Beaumont, Ross Wightman, J. Jitsev
2022
arXiv

Scaling up neural networks has led to remarkable performance across a wide range of tasks. Moreover, performance often follows reliable scaling laws as a function of training set size, model size,… 
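
Such scaling laws are typically power laws in quantities like training set size. The snippet below sketches fitting one with SciPy, using made-up measurements and an assumed error(N) = a * N^(-b) + c form, purely to illustrate the kind of relationship the paper studies.

```python
# Illustrative power-law fit of the kind reported in scaling-law studies:
# error(N) ~ a * N**(-b) + c as a function of training set size N.
import numpy as np
from scipy.optimize import curve_fit


def power_law(n, a, b, c):
    return a * np.power(n, -b) + c


# Hypothetical (training set size, error rate) measurements.
sizes = np.array([1e6, 3e6, 1e7, 3e7, 1e8])
errors = np.array([0.52, 0.44, 0.37, 0.32, 0.28])

(a, b, c), _ = curve_fit(power_law, sizes, errors, p0=(1.0, 0.1, 0.1), maxfev=10000)
print(f"fitted exponent b = {b:.3f}")
```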

Continued Pretraining for Better Zero- and Few-Shot Promptability

Zhaofeng Wu, Robert L. Logan IV, Pete Walsh, Iz Beltagy
2022
EMNLP

Recently introduced language model prompting methods can achieve high accuracy in zero- and few-shot settings while requiring few to no learned task-specific parameters. Nevertheless, these methods… 

Exploring The Landscape of Distributional Robustness for Question Answering Models

Anas Awadalla, Mitchell Wortsman, Gabriel Ilharco, Ludwig Schmidt
2022
Findings of EMNLP

We conduct a large empirical evaluation to investigate the landscape of distributional robustness in question answering. Our investigation spans over 350 models and 16 question answering datasets,… 

Hyperdecoders: Instance-specific decoders for multi-task NLP

Hamish Ivison, Matthew E. Peters
2022
Findings of EMNLP

We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder. This… 
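
The mechanism described above can be sketched roughly as follows: a hypernetwork takes a pooled encoder representation and emits the weights of a small bottleneck adapter applied inside the decoder. Dimensions, module names, and the bottleneck parameterization are assumptions made for illustration, not the paper's exact architecture.

```python
# Sketch of an input-conditioned hypernetwork that generates a bottleneck
# adapter for the decoder from a pooled encoder representation.
import torch
import torch.nn as nn


class AdapterHypernetwork(nn.Module):
    def __init__(self, enc_dim: int = 768, hidden: int = 128, bottleneck: int = 64):
        super().__init__()
        self.enc_dim = enc_dim
        self.bottleneck = bottleneck
        # The hypernetwork predicts both projection matrices of a bottleneck adapter.
        self.net = nn.Sequential(
            nn.Linear(enc_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * enc_dim * bottleneck),
        )

    def forward(self, pooled_encoder_output: torch.Tensor, hidden_states: torch.Tensor) -> torch.Tensor:
        # pooled_encoder_output: (batch, enc_dim); hidden_states: (batch, seq, enc_dim)
        flat = self.net(pooled_encoder_output)
        down, up = flat.split(self.enc_dim * self.bottleneck, dim=-1)
        down = down.view(-1, self.enc_dim, self.bottleneck)
        up = up.view(-1, self.bottleneck, self.enc_dim)
        # Apply the generated adapter to decoder hidden states with a residual connection.
        return hidden_states + torch.bmm(torch.relu(torch.bmm(hidden_states, down)), up)
```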

CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation

Abhilasha Ravichander, Matt Gardner, Ana Marasović
2022
EMNLP

The full power of human language-based communication cannot be realized without negation. All human languages have some form of negation. Despite this, negation remains a challenging phenomenon for… 

GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation

Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Daniel S. Weld
2022
EMNLP

While often assumed to be a gold standard, effective human evaluation of text generation remains an important, open area for research. We revisit this problem with a focus on producing consistent…