Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents

Kyle Lo, Zejiang Shen, Benjamin Newman, Luca Soldaini
2023
EMNLP

Despite growing interest in applying natural language processing (NLP) and computer vision (CV) models to the scholarly domain, scientific documents remain challenging to work with. They’re often in… 

SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

Hyunwoo Kim, Jack Hessel, Liwei Jiang, Yejin Choi
2023
EMNLP

We present SODA: the first publicly available, million-scale high-quality social dialogue dataset. Using SODA, we train COSMO: a generalizable conversation agent outperforming previous… 

NLPositionality: Characterizing Design Biases of Datasets and Models

Sebastin Santy, Jenny T. Liang, Ronan Le Bras, Maarten Sap
2023
ACL

Design biases in NLP systems, such as performance differences for different populations, often stem from their creator's positionality, i.e., views and lived experiences shaped by identity and… 

Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest

Jack Hessel, Ana Marasović, Jena D. Hwang, Yejin Choi
2023
ACL

We challenge AI models to “demonstrate understanding” of the sophisticated multimodal humor of The New Yorker Caption Contest. Concretely, we develop three carefully circumscribed tasks for which… 

Visual Programming: Compositional visual reasoning without training

Tanmay Gupta, Aniruddha Kembhavi
2023
CVPR

We present VISPROG, a neuro-symbolic approach to solving complex and compositional visual tasks given natural language instructions. VISPROG avoids the need for any task-specific training. Instead,… 

The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks

Nikil Selvam, Sunipa Dev, Daniel Khashabi, Kai-Wei Chang
2023
ACL

How reliably can we trust the scores obtained from social bias benchmarks as faithful indicators of problematic social biases in a given language model? In this work, we study this question by… 

Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker

Melanie Sclar, Sachin Kumar, Peter West, Yulia Tsvetkov
2023
ACL

Theory of Mind (ToM), the ability to reason about the mental states of other people, is a key element of our social intelligence. Yet, despite their ever more… 

LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization

Kalpesh Krishna, Erin Bransom, Bailey Kuehl, Kyle Lo
2023
EACL

While human evaluation remains best practice for accurately judging the faithfulness of automatically-generated summaries, few solutions exist to address the increased difficulty and workload when… 

CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context

Joseph Chee Chang, Amy X. Zhang, Jonathan Bragg, Daniel S. Weld
2023
CHI

When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work. However, it can be challenging to prioritize and make sense of… 

Queer In AI: A Case Study in Community-Led Participatory AI

Organizers of Queer in AI, Anaelia Ovalle, Arjun Subramonian, Luke Stark
2023
FAccT

We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional tenets started and shaped this community's programs over…