Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

OLMES: A Standard for Language Model Evaluations

Yuling Gu, Oyvind Tafjord, Bailey Kuehl, Hanna Hajishirzi

2024

arXiv.org

Progress in AI is often demonstrated by new models claiming improved performance on tasks measuring model capabilities. Evaluating language models in particular is challenging, as small changes to…

BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models

Qingqing Cao, Sewon Min, Yizhong Wang, Hannaneh Hajishirzi

2024

ICLR

Retrieval augmentation addresses many critical problems in large language models such as hallucination, staleness, and privacy leaks. However, running retrieval-augmented language models (LMs) is…

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

Pan Lu, Hritik Bansal, Tony Xia, Jianfeng Gao

2024

ICLR

Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts…

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Akari Asai, Zeqiu Wu, Yizhong Wang, Hannaneh Hajishirzi

2024

ICLR

Despite their remarkable capabilities, large language models (LLMs) often produce responses containing factual inaccuracies due to their sole reliance on the parametric knowledge they encapsulate.…

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

Sewon Min, Suchin Gururangan, Eric Wallace, Luke Zettlemoyer

2024

ICLR

The legality of training language models (LMs) on copyrighted or otherwise restricted data is under intense debate. However, as we show, model performance significantly degrades if trained only on…

TRAM: Bridging Trust Regions and Sharpness Aware Minimization

Tom Sherborne, Naomi Saphra, Pradeep Dasigi, Hao Peng

2024

ICLR

By reducing the curvature of the loss surface in the parameter space, sharpness-aware minimization (SAM) yields widespread robustness improvement under domain transfer. Instead of focusing on…

What's In My Big Data?

Yanai Elazar, Akshita Bhagia, Ian Magnusson, Jesse Dodge

2024

ICLR

Large text corpora are the backbone of language models. However, we have a limited understanding of the content of these corpora, including general statistics, quality, social factors, and inclusion…

A Legal Risk Taxonomy for Generative Artificial Intelligence

David Atkinson, Jacob Morrison

2024

arXiv.org

For the first time, this paper presents a taxonomy of legal risks associated with generative AI (GenAI) by breaking down complex legal concepts to provide a common understanding of potential legal…

Estimating the Causal Effect of Early ArXiving on Paper Acceptance

Yanai Elazar, Jiayao Zhang, David Wadden, Noah A. Smith

2024

CLeaR

What is the effect of releasing a preprint of a paper before it is submitted for peer review? No randomized controlled trial has been conducted, so we turn to observational data to answer this…

A Survey on Data Selection for Language Models

Alon Albalak, Yanai Elazar, Sang Michael Xie, William Yang Wang

2024

arXiv

A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training. However, naively training a model on all available…
