Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Task-aware Retrieval with Instructions

Akari Asai, Timo Schick, Patrick Lewis, Wen-tau Yih
2023
Findings of ACL

We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries. We aim to develop a general-purpose task-aware… 

When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories

Alex Mallen, Akari Asai, Victor Zhong, Hannaneh Hajishirzi
2023
ACL

Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the difficulty of encoding a wealth of world… 

Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications

Li Lucy, Jesse Dodge, David Bamman, Katherine A. Keith
2023
Findings of ACL

Scholarly text is often laden with jargon, or specialized language that can facilitate efficient in-group communication within fields but hinder understanding for out-groups. In this work, we… 

Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance

Yao Fu, Litu Ou, Mingyu Chen, Tushar Khot
2023
ICML 2023 Workshop on Challenges in Deployable Generative AI

As large language models (LLMs) are continuously being developed, their evaluation becomes increasingly important yet challenging. This work proposes Chain-of-Thought Hub, an open-source evaluation… 

ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews

Mike D'Arcy, Alexis Ross, Erin Bransom, Doug Downey
2023
arXiv.org

Revising scientific papers based on peer feedback is a challenging task that requires not only deep scientific knowledge and reasoning, but also the ability to recognize the implicit requests in… 

Evaluating the Social Impact of Generative AI Systems in Systems and Society

Irene Solaiman, Zeerak Talat, William Agnew, Apostol T. Vassilev
2023
arXiv.org

Generative AI systems across modalities, spanning text, image, audio, and video, have broad social impacts, but there exists no official standard for means of evaluating those impacts and which… 

Morphosyntactic probing of multilingual BERT models

Judit Ács, Endre Hamerlik, Roy Schwartz, András Kornai
2023
Journal of Natural Language Engineering

We introduce an extensive dataset for multilingual probing of morphological information in language models (247 tasks across 42 languages from 10 families), each consisting of a sentence with a… 

Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text

Wanrong Zhu, Jack Hessel, Anas Awadalla, Yejin Choi
2023
arXiv.org

In-context vision and language models like Flamingo support arbitrarily interleaved sequences of images and text as input. This format not only enables few-shot learning via interleaving independent… 

Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations

Xinxi Lyu, Sewon Min, Iz Beltagy, Hannaneh Hajishirzi
2023
ACL

Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available. In this paper, we introduce Z-ICL, a new… 

Just CHOP: Embarrassingly Simple LLM Compression

Ananya Harsh Jha, Tom Sherborne, Evan Pete Walsh, Iz Beltagy
2023
arXiv.org

Large language models (LLMs) enable unparalleled few- and zero-shot reasoning capabilities but with a high computational footprint. A growing assortment of methods for compression promises to reduce…