Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Editing Common Sense in Transformers

Anshita Gupta*, Debanjan Mondal*, Akshay Krishna Sheshadri*, Niket Tandon*
2023
EMNLP

Editing model parameters directly in Transformers makes updating open-source transformer-based models possible without re-training. However, these editing methods have only been evaluated on… 

Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy

Sarah Wiegreffe, Matthew Finlayson, Oyvind Tafjord, Ashish Sabharwal
2023
EMNLP

When pretrained language models (LMs) are applied to discriminative tasks such as multiple-choice questions, they place probability mass on vocabulary tokens that aren't among the given answer… 

Language Models with Rationality

Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Peter Clark
2023
EMNLP

While large language models (LLMs) are proficient at question-answering (QA), the dependencies between their answers and other "beliefs" they may have about the world are typically unstated, and may… 

What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations

Kavel Rao, Liwei Jiang, Valentina Pyatkin, Yejin Choi
2023
EMNLP • Findings

Moral or ethical judgments rely heavily on the specific contexts in which they occur. Understanding varying shades of defeasible contextualizations (i.e., additional information that strengthens or… 

Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena

Jiangjie Chen, Siyu Yuan, Rong Ye, Kyle Richardson
2023
arXiv

Can Large Language Models (LLMs) simulate human behavior in complex environments? LLMs have recently been shown to exhibit advanced reasoning skills but much of NLP evaluation still relies on static… 

Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations

Nirbhay Modhe, Qiaozi Gao, A. Kalyan, G. Sukhatme
2023
arXiv

Offline reinforcement learning (RL) methods strike a balance between exploration and exploitation by conservative value estimation -- penalizing values of unseen states and actions. Model-free… 

DISCO: Distilling Phrasal Counterfactuals with Large Language Models

Zeming Chen, Qiyue Gao, Kyle Richardson, Ashish Sabharwal
2023
ACL

Recent methods demonstrate that data augmentation using counterfactual knowledge can teach models the causal structure of a task, leading to robust and generalizable models. However, such… 

Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
2023
ACL

Prompting-based large language models (LLMs) are surprisingly powerful at generating natural language reasoning steps or Chains-of-Thoughts (CoT) for multi-step question answering (QA). They… 

Do language models have coherent mental models of everyday things?

Yuling Gu, Bhavana Dalvi Mishra, Peter Clark
2023
ACL

When people think of everyday things like an “egg,” they typically have a mental image associated with it. This commonsense knowledge helps us understand how these everyday things work and how to… 

RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs

Afra Feyza Akyurek, Ekin Akyürek, Aman Madaan, Niket Tandon
2023
ACL

Despite their unprecedented success, even the largest language models make mistakes. Similar to how humans learn and improve using feedback, previous work proposed providing language models with…