Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

Pan Lu, Hritik Bansal, Tony Xia, Jianfeng Gao
2024
ICLR

Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts… 

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Akari Asai, Zeqiu Wu, Yizhong Wang, Hannaneh Hajishirzi
2024
ICLR

Despite their remarkable capabilities, large language models (LLMs) often produce responses containing factual inaccuracies due to their sole reliance on the parametric knowledge they encapsulate.… 

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

Sewon Min, Suchin Gururangan, Eric Wallace, Luke Zettlemoyer
2024
ICLR

The legality of training language models (LMs) on copyrighted or otherwise restricted data is under intense debate. However, as we show, model performance significantly degrades if trained only on… 

TRAM: Bridging Trust Regions and Sharpness Aware Minimization

Tom Sherborne, Naomi Saphra, Pradeep Dasigi, Hao Peng
2024
ICLR

By reducing the curvature of the loss surface in the parameter space, sharpness-aware minimization (SAM) yields widespread robustness improvements under domain transfer. Instead of focusing on… 

What's In My Big Data?

Yanai Elazar, Akshita Bhagia, Ian Magnusson, Jesse Dodge
2024
ICLR

Large text corpora are the backbone of language models. However, we have a limited understanding of the content of these corpora, including general statistics, quality, social factors, and inclusion… 

A Legal Risk Taxonomy for Generative Artificial Intelligence

David Atkinson, Jacob Morrison
2024
arXiv

For the first time, this paper presents a taxonomy of legal risks associated with generative AI (GenAI) by breaking down complex legal concepts to provide a common understanding of potential legal… 

Estimating the Causal Effect of Early ArXiving on Paper Acceptance

Yanai Elazar, Jiayao Zhang, David Wadden, Noah A. Smith
2024
CLeaR

What is the effect of releasing a preprint of a paper before it is submitted for peer review? No randomized controlled trial has been conducted, so we turn to observational data to answer this… 

A Survey on Data Selection for Language Models

Alon Albalak, Yanai Elazar, Sang Michael Xie, William Yang Wang
2024
arXiv

A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training. However, naively training a model on all available… 

Calibrating Large Language Models with Sample Consistency

Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Chris Callison-Burch
2024
arXiv

Accurately gauging the confidence level of Large Language Models' (LLMs) predictions is pivotal for their reliable application. However, LLMs are often uncalibrated inherently and elude conventional… 

OLMo: Accelerating the Science of Language Models

Dirk Groeneveld, Iz Beltagy, Pete Walsh, Hanna Hajishirzi
2024
ACL

Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off,…