Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


ADaPT: As-Needed Decomposition and Planning with Language Models

Archiki Prasad, Alexander Koller, Mareike Hartmann, Tushar Khot
2024
NAACL Findings

Large Language Models (LLMs) are increasingly being used for interactive decision-making tasks requiring planning and adapting to the environment. Recent works employ LLMs-as-agents in broadly two… 

Leveraging Code to Improve In-context Learning for Semantic Parsing

Ben Bogin, Shivanshu Gupta, Peter Clark, Ashish Sabharwal
2024
NAACL

In-context learning (ICL) is an appealing approach for semantic parsing due to its few-shot nature and improved generalization. However, learning to parse to rare domain-specific languages (DSLs)… 

QualEval: Qualitative Evaluation for Model Improvement

Vishvak Murahari, Ameet Deshpande, Peter Clark, Ashwin Kalyan
2024
NAACL

Quantitative evaluation metrics have traditionally been pivotal in gauging the advancements of artificial intelligence systems, including large language models (LLMs). However, these metrics have… 

To Tell The Truth: Language of Deception and Language Models

Sanchaita Hazra, Bodhisattwa Prasad Majumder
2024
NAACL

Text-based false information permeates online discourses, yet evidence of people’s ability to discern truth from such deceptive textual content is scarce. We analyze novel TV game show data where… 

OLMES: A Standard for Language Model Evaluations

Yuling Gu, Oyvind Tafjord, Bailey Kuehl, Hanna Hajishirzi
2024
arXiv.org

Progress in AI is often demonstrated by new models claiming improved performance on tasks measuring model capabilities. Evaluating language models in particular is challenging, as small changes to… 

SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals

Ruihan Yang, Jiangjie Chen, Yikai Zhang, Deqing Yang
2024
technical report

Language agents powered by large language models (LLMs) are increasingly valuable as decision-making tools in domains such as gaming and programming. However, these agents often face challenges in… 

Digital Socrates: Evaluating LLMs through explanation critiques

Yuling Gu, Oyvind Tafjord, Peter Clark
2024
ACL

While LLMs can provide reasoned explanations along with their answers, the nature and quality of those explanations are still poorly understood. In response, our goal is to define a detailed way of… 

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

Shashank Gupta, Vaishnavi Shrivastava, A. Deshpande, Tushar Khot
2024
ICLR

Recent works have showcased the ability of LLMs to embody diverse personas in their responses, exemplified by prompts like 'You are Yoda. Explain the Theory of Relativity.' While this ability allows… 

The Expressive Power of Transformers with Chain of Thought

William Merrill, Ashish Sabharwal
2024
ICLR

Recent theoretical work has identified surprisingly simple reasoning problems, such as checking if two nodes in a graph are connected or simulating finite-state machines, that are provably… 

Closing the Curious Case of Neural Text Degeneration

Matthew Finlayson, John Hewitt, Alexander Koller, Ashish Sabharwal
2024
ICLR

Despite their ubiquity in language generation, it remains unknown why truncation sampling heuristics like nucleus sampling are so effective. We provide a theoretical explanation for the…