Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


ADaPT: As-Needed Decomposition and Planning with Language Models

Archiki Prasad, Alexander Koller, Mareike Hartmann, Tushar Khot
2024
NAACL Findings

Large Language Models (LLMs) are increasingly being used for interactive decision-making tasks requiring planning and adapting to the environment. Recent works employ LLMs-as-agents in broadly two… 

Leveraging Code to Improve In-context Learning for Semantic Parsing

Ben Bogin, Shivanshu Gupta, Peter Clark, Ashish Sabharwal
2024
NAACL

In-context learning (ICL) is an appealing approach for semantic parsing due to its few-shot nature and improved generalization. However, learning to parse to rare domain-specific languages (DSLs)… 

QualEval: Qualitative Evaluation for Model Improvement

Vishvak Murahari, Ameet Deshpande, Peter Clark, Ashwin Kalyan
2024
NAACL

Quantitative evaluation metrics have traditionally been pivotal in gauging the advancements of artificial intelligence systems, including large language models (LLMs). However, these metrics have… 

To Tell The Truth: Language of Deception and Language Models

Sanchaita Hazra, Bodhisattwa Prasad Majumder
2024
NAACL

Text-based false information permeates online discourses, yet evidence of people’s ability to discern truth from such deceptive textual content is scarce. We analyze novel TV game show data where… 

OLMES: A Standard for Language Model Evaluations

Yuling Gu, Oyvind Tafjord, Bailey Kuehl, Hanna Hajishirzi
2024
arXiv.org

Progress in AI is often demonstrated by new models claiming improved performance on tasks measuring model capabilities. Evaluating language models in particular is challenging, as small changes to… 

SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals

Ruihan Yang, Jiangjie Chen, Yikai Zhang, Deqing Yang
2024
technical report

Language agents powered by large language models (LLMs) are increasingly valuable as decision-making tools in domains such as gaming and programming. However, these agents often face challenges in… 

Digital Socrates: Evaluating LLMs through explanation critiques

Yuling Gu, Oyvind Tafjord, Peter Clark
2024
ACL

While LLMs can provide reasoned explanations along with their answers, the nature and quality of those explanations are still poorly understood. In response, our goal is to define a detailed way of… 

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

Shashank Gupta, Vaishnavi Shrivastava, A. Deshpande, Tushar Khot
2024
ICLR

Recent works have showcased the ability of LLMs to embody diverse personas in their responses, exemplified by prompts like 'You are Yoda. Explain the Theory of Relativity.' While this ability allows… 

The Expressive Power of Transformers with Chain of Thought

William Merrill, Ashish Sabharwal
2024
ICLR

Recent theoretical work has identified surprisingly simple reasoning problems, such as checking if two nodes in a graph are connected or simulating finite-state machines, that are provably… 

Closing the Curious Case of Neural Text Degeneration

Matthew Finlayson, John Hewitt, Alexander Koller, Ashish Sabharwal
2024
ICLR

Despite their ubiquity in language generation, it remains unknown why truncation sampling heuristics like nucleus sampling are so effective. We provide a theoretical explanation for the…