Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

What's In My Big Data?

Yanai Elazar, Akshita Bhagia, Ian Magnusson, Jesse Dodge

2024

ICLR

Large text corpora are the backbone of language models. However, we have a limited understanding of the content of these corpora, including general statistics, quality, social factors, and inclusion…

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

Shashank Gupta, Vaishnavi Shrivastava, A. Deshpande, Tushar Khot

2024

ICLR

Recent works have showcased the ability of LLMs to embody diverse personas in their responses, exemplified by prompts like 'You are Yoda. Explain the Theory of Relativity.' While this ability allows…

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Akari Asai, Zeqiu Wu, Yizhong Wang, Hannaneh Hajishirzi

2024

ICLR

Despite their remarkable capabilities, large language models (LLMs) often produce responses containing factual inaccuracies due to their sole reliance on the parametric knowledge they encapsulate.…

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

Pan Lu, Hritik Bansal, Tony Xia, Jianfeng Gao

2024

ICLR

Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts…

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

Sewon Min, Suchin Gururangan, Eric Wallace, Luke Zettlemoyer

2024

ICLR

The legality of training language models (LMs) on copyrighted or otherwise restricted data is under intense debate. However, as we show, model performance significantly degrades if trained only on…

BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models

Qingqing Cao, Sewon Min, Yizhong Wang, Hannaneh Hajishirzi

2024

ICLR

Retrieval augmentation addresses many critical problems in large language models such as hallucination, staleness, and privacy leaks. However, running retrieval-augmented language models (LMs) is…

WildChat: 1M ChatGPT Interaction Logs in the Wild

Wenting Zhao, Xiang Ren, J. Hessel, Yuntian Deng

2024

ICLR

Chatbots such as GPT-4 and ChatGPT are now serving millions of users. Despite their widespread use, there remains a lack of public datasets showcasing how these tools are used by a population of…

Tailoring Self-Rationalizers with Multi-Reward Distillation

Sahana Ramnath, Brihi Joshi, Skyler Hallinan, Xiang Ren

2024

ICLR

Large language models (LMs) are capable of generating free-text rationales to aid question answering. However, prior work 1) suggests that useful self-rationalization is emergent only at significant…

PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Yejin Choi

2024

ICLR

Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense…
