Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.


Digital Socrates: Evaluating LLMs through explanation critiques

Yuling Gu, Oyvind Tafjord, Peter Clark
2024
ACL

While LLMs can provide reasoned explanations along with their answers, the nature and quality of those explanations are still poorly understood. In response, our goal is to define a detailed way of… 

Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

Zichen Zhang, Yunshuang Li, Osbert Bastani, Luca Weihs
2024
IEEE International Conference on Robotics and Automation

Real-world robotic tasks stretch over extended horizons and encompass multiple stages. Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the… 

A Design Space for Intelligent and Interactive Writing Assistants

Mina Lee, Katy Ilonka Gero, John Joon Young Chung, Pao Siangliulue
2024
CHI

In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge… 

Mitigating Barriers to Public Social Interaction with Meronymous Communication

Nouran Soliman, Hyeonsu B. Kang, Matthew Latzke, David R. Karger
2024
CHI

In communities with social hierarchies, fear of judgment can discourage communication. While anonymity may alleviate some social pressure, fully anonymous spaces enable toxic behavior and hide the… 

PaperWeaver: Enriching Topical Paper Alerts by Contextualizing Recommended Papers with User-collected Papers

Yoonjoo Lee, Hyeonsu B. Kang, Matt Latzke, Pao Siangliulue
2024
CHI

With the rapid growth of scholarly archives, researchers subscribe to "paper alert" systems that periodically provide them with recommendations of recently published papers that are similar to… 

Improving Language Models with Advantage-based Offline Policy Gradients

Ashutosh Baheti, Ximing Lu, Faeze Brahman, Mark O. Riedl
2024
ICLR

Language Models (LMs) achieve substantial language capabilities when finetuned using Reinforcement Learning with Human Feedback (RLHF). However, RLHF is an unstable and data-hungry process that… 

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

Shashank Gupta, Vaishnavi Shrivastava, A. Deshpande, Tushar Khot
2024
ICLR

Recent works have showcased the ability of LLMs to embody diverse personas in their responses, exemplified by prompts like 'You are Yoda. Explain the Theory of Relativity.' While this ability allows… 

BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models

Qingqing Cao, Sewon Min, Yizhong Wang, Hannaneh Hajishirzi
2024
ICLR

Retrieval augmentation addresses many critical problems in large language models such as hallucination, staleness, and privacy leaks. However, running retrieval-augmented language models (LMs) is… 

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

Pan Lu, Hritik Bansal, Tony Xia, Jianfeng Gao
2024
ICLR

Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts… 

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Akari Asai, Zeqiu Wu, Yizhong Wang, Hannaneh Hajishirzi
2024
ICLR

Despite their remarkable capabilities, large language models (LLMs) often produce responses containing factual inaccuracies due to their sole reliance on the parametric knowledge they encapsulate.…