Papers

  • Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations

    Xinxi Lyu, Sewon Min, Iz Beltagy, Luke Zettlemoyer, Hannaneh Hajishirzi. ACL 2023. Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available. In this paper, we introduce Z-ICL, a new zero-shot method that closes the gap by constructing pseudo… (a sketch of the pseudo-demonstration idea follows this list).
  • Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

    Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata. arXiv 2023. We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing. We are interested in this question because if LLMs were able to improve each other, it would imply the…
  • LeTI: Learning to Generate from Textual Interactions

    Xingyao Wang, Hao Peng, Reyhaneh Jabbarvand, Heng Ji. arXiv 2023. Finetuning pre-trained language models (LMs) enhances the models' capabilities. Prior techniques fine-tune a pre-trained LM on input-output pairs (e.g., instruction fine-tuning), or with numerical rewards that gauge the quality of its outputs (e.g…
  • TESS: Text-to-Text Self-Conditioned Simplex Diffusion

    Rabeeh Karimi Mahabadi, Jaesung Tae, Hamish Ivison, J. Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan. arXiv 2023. Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various domains with continuous-valued inputs. Despite the promises of fully non-autoregressive text generation, applying diffusion models to natural language…
  • Binding Language Models in Symbolic Languages

    Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu. ICLR 2023. Though end-to-end neural approaches have recently been dominating NLP tasks in both performance and ease-of-use, they lack interpretability and robustness. We propose Binder, a training-free neural-symbolic framework that maps the task input to a program…
  • Complexity-Based Prompting for Multi-Step Reasoning

    Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot. ICLR 2023. We study the task of prompting large-scale language models to perform multi-step reasoning. Existing work shows that when prompted with a chain of thoughts (CoT), sequences of short sentences describing intermediate reasoning steps towards a final answer… (a sketch of complexity-based exemplar selection follows this list).
  • Editing Models with Task Arithmetic

    Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi. ICLR 2023. Changing how pre-trained models behave -- e.g., improving their performance on a downstream task or mitigating biases learned during pre-training -- is a common practice when developing machine learning systems. In this work, we propose a new paradigm for… (a sketch of the task-vector idea follows this list).
  • InSCIt: Information-Seeking Conversations with Mixed-Initiative Interactions

    Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi. TACL 2023. In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable. An ideal agent would interact by initiating different response types according to the available knowledge sources. However, most current studies either…
  • Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization

    Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, Yejin Choi. ICLR 2023. We tackle the problem of aligning pre-trained large language models (LMs) with human preferences. If we view text generation as a sequential decision-making problem, reinforcement learning (RL) appears to be a natural conceptual framework. However, using RL…
  • LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization

    Kalpesh Krishna, Erin Bransom, Bailey Kuehl, Mohit Iyyer, Pradeep Dasigi, Arman Cohan, Kyle Lo. EACL 2023. While human evaluation remains best practice for accurately judging the faithfulness of automatically-generated summaries, few solutions exist to address the increased difficulty and workload when evaluating long-form summaries. Through a survey of 162 papers…
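
A minimal sketch of the pseudo-demonstration idea from the Z-ICL entry above, assuming nearest-neighbor retrieval over a raw text corpus via a sentence-embedding function and random pairing with label names. The function names, retrieval scheme, and label pairing here are illustrative assumptions from the abstract snippet, not the paper's exact construction.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def build_pseudo_demonstrations(test_input, corpus, embed, labels, k=4):
    """Retrieve the k corpus sentences nearest to the test input and pair
    each with a randomly sampled label name (pseudo-demonstrations).
    `embed` and `labels` are hypothetical inputs for illustration."""
    query = embed(test_input)
    neighbors = sorted(corpus, key=lambda s: cosine(embed(s), query),
                       reverse=True)[:k]
    return [(s, random.choice(labels)) for s in neighbors]

def format_prompt(demos, test_input):
    """Concatenate pseudo-demonstrations and the test input into a prompt,
    so a zero-shot query looks like a few-shot one."""
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    blocks.append(f"Input: {test_input}\nLabel:")
    return "\n\n".join(blocks)
```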
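A sketch of complexity-based exemplar selection from the Complexity-Based Prompting entry, assuming complexity is approximated by counting newline-separated reasoning steps and that final answers are voted over the most complex sampled chains; `extract_answer` and the candidate/sample inputs are hypothetical placeholders, not the paper's interface.

```python
from collections import Counter

def complexity(chain: str) -> int:
    """Approximate reasoning complexity as the number of non-empty lines
    (one line per intermediate reasoning step)."""
    return sum(1 for line in chain.splitlines() if line.strip())

def select_exemplars(candidates, n=8):
    """Keep the n (question, chain-of-thought) pairs whose chains have the
    most steps, to use as in-context exemplars."""
    return sorted(candidates, key=lambda qa: complexity(qa[1]),
                  reverse=True)[:n]

def vote_over_complex_chains(samples, extract_answer, top_k=5):
    """Majority-vote over answers extracted from the top_k most complex
    sampled chains (complexity-based voting at decoding time)."""
    top = sorted(samples, key=complexity, reverse=True)[:top_k]
    answers = [extract_answer(chain) for chain in top]
    return Counter(answers).most_common(1)[0][0]
```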
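A sketch of the task-vector idea from the Editing Models with Task Arithmetic entry: a task vector is the element-wise difference between fine-tuned and pre-trained weights, and scaled sums or negations of task vectors are added back to the pre-trained weights. The state-dict representation and the scaling coefficient `lam` are a plain reading of the abstract, not the paper's full recipe.

```python
import torch

def task_vector(pretrained: dict, finetuned: dict) -> dict:
    """tau = theta_finetuned - theta_pretrained, per parameter tensor."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_task_vectors(pretrained: dict, vectors: list, lam: float = 1.0) -> dict:
    """theta_new = theta_pretrained + lam * sum_i tau_i."""
    edited = {k: v.clone() for k, v in pretrained.items()}
    for tau in vectors:
        for k in edited:
            edited[k] += lam * tau[k]
    return edited

def negate(tau: dict) -> dict:
    """Flip a task vector; adding -tau removes what fine-tuning learned."""
    return {k: -v for k, v in tau.items()}
```

Negating a task vector before adding it back corresponds to the abstract's "mitigating biases" use case, while summing several task vectors composes behavior from multiple fine-tunes.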