Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback

Nathan Lambert, Roberto Calandra
2023
arXiv

Reinforcement learning from human feedback (RLHF) has emerged as a powerful technique to make large language models (LLMs) easier to prompt and more capable in complex settings. RLHF at its core is… 

Papeos: Augmenting Research Papers with Talk Videos

Tae Soo Kim, Matt Latzke, Jonathan Bragg, Joseph Chee Chang
2023
UIST

Research consumption has been traditionally limited to the reading of academic papers—a static, dense, and formally written format. Alternatively, pre-recorded conference presentation videos, which… 

Synergi: A Mixed-Initiative System for Scholarly Synthesis and Sensemaking

Hyeonsu B Kang, Sherry Wu, Joseph Chee Chang, A. Kittur
2023
UIST

Efficiently reviewing scholarly literature and synthesizing prior art are crucial for scientific progress. Yet, the growing scale of publications and the burden of knowledge make synthesis of… 

Entangled Preferences: The History and Risks of Reinforcement Learning and Human Feedback

Nathan Lambert, Thomas Krendl Gilbert, Tom Zick
2023
arXiv

Reinforcement learning from human feedback (RLHF) has emerged as a powerful technique to make large language models (LLMs) easier to use and more effective. A core piece of the RLHF process is the… 

A taxonomy and review of generalization research in NLP

D. Hupkes, Mario Giulianelli, Verna Dankers, Zhijing Jin
2023
Nature Machine Intelligence

The ability to generalise well is one of the primary desiderata of natural language processing (NLP). Yet, what ‘good generalisation’ entails and how it should be evaluated is not well… 

Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena

Jiangjie Chen, Siyu Yuan, Rong Ye, Kyle Richardson
2023
arXiv

Can Large Language Models (LLMs) simulate human behavior in complex environments? LLMs have recently been shown to exhibit advanced reasoning skills but much of NLP evaluation still relies on static… 

SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding

Favyen Bastani, Piper Wolters, Ritwik Gupta, Aniruddha Kembhavi
2023
ICCV

Remote sensing images are useful for a wide variety of planet monitoring applications, from tracking deforestation to tackling illegal fishing. The Earth is extremely diverse -- the amount of… 

Making Retrieval-Augmented Language Models Robust to Irrelevant Context

Ori Yoran, Tomer Wolfson, Ori Ram, Jonathan Berant
2023
ICLR

Retrieval-augmented language models (RALMs) hold promise to produce language understanding systems that are factual, efficient, and up-to-date. An important desideratum of RALMs is that… 

The Surveillance AI Pipeline

Pratyusha Ria Kalluri, William Agnew, M. Cheng, A. Birhane
2023
arXiv

A rapidly growing number of voices have argued that AI research, and computer vision in particular, is closely tied to mass surveillance. Yet the direct path from computer vision research to… 

When do Generative Query and Document Expansions Fail? A Comprehensive Study Across Methods, Retrievers, and Datasets

Orion Weller, Kyle Lo, David Wadden, Luca Soldaini
2023
arXiv

Using large language models (LMs) for query or document expansion can improve generalization in information retrieval. However, it is unknown whether these techniques are universally beneficial or…