Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge

Phillip Howard, Junlin Wang, Vasudev Lal, Swabha Swayamdipta
2024
NAACL

Comparative knowledge (e.g., steel is stronger and heavier than styrofoam) is an essential component of our world knowledge, yet understudied in prior literature. In this paper, we harvest the… 

Promptly Predicting Structures: The Return of Inference

Maitrey Mehta, Valentina Pyatkin, Vivek Srikumar
2024
NAACL

Prompt-based methods have been used extensively across NLP to build zero- and few-shot label predictors. Many NLP tasks are naturally structured: that is, their outputs consist of multiple labels… 

UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations

Wenting Zhao, Justin T Chiu, Jena D. Hwang, Alane Suhr
2024
NAACL

Language technologies that accurately model the dynamics of events must perform commonsense reasoning. Existing work evaluating commonsense reasoning focuses on making inferences about common,… 

Improving Language Models with Advantage-based Offline Policy Gradients

Ashutosh Baheti, Ximing Lu, Faeze Brahman, Mark O. Riedl
2024
ICLR

Language Models (LMs) achieve substantial language capabilities when finetuned using Reinforcement Learning with Human Feedback (RLHF). However, RLHF is an unstable and data-hungry process that… 

WildChat: 1M ChatGPT Interaction Logs in the Wild

Wenting Zhao, Xiang Ren, J. Hessel, Yuntian Deng
2024
ICLR

Chatbots such as GPT-4 and ChatGPT are now serving millions of users. Despite their widespread use, there remains a lack of public datasets showcasing how these tools are used by a population of… 

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yejin Choi
2024
ICLR

The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks: LLMs are fed different types of information from… 

Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement

Linlu Qiu, Liwei Jiang, Ximing Lu, Xiang Ren
2024
ICLR

The ability to derive underlying principles from a handful of observations and then generalize to novel situations -- known as inductive reasoning -- is central to human intelligence. Prior work… 

PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Yejin Choi
2024
ICLR

Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense… 

Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting

Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
2024
ICLR

As large language models (LLMs) are adopted as a fundamental component of language technologies, it is crucial to accurately characterize their performance. Because choices in prompt design can… 

Tailoring Self-Rationalizers with Multi-Reward Distillation

Sahana Ramnath, Brihi Joshi, Skyler Hallinan, Xiang Ren
2024
ICLR

Large language models (LMs) are capable of generating free-text rationales to aid question answering. However, prior work 1) suggests that useful self-rationalization is emergent only at significant…