Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Cross-Lingual GenQA: Open-Domain Question Answering with Answer Sentence Generation
Recent approaches for question answering systems have achieved impressive performance on English by combining document-level retrieval with answer generation. These approaches, which we refer to as…
One Venue, Two Conferences: The Separation of Chinese and American Citation Networks
At NeurIPS, American and Chinese institutions cite papers from each other’s regions substantially less than they cite endogamously. We build a citation graph to quantify this divide, compare it to…
I Can't Believe There's No Images! Learning Visual Tasks Using only Language Data
Many high-level skills that are required for computer vision tasks, such as parsing questions, comparing and contrasting semantics, and writing descriptions, are also required in other domains such…
Breakpoint Transformers for Modeling and Tracking Intermediate Beliefs
Can we teach natural language understanding models to track their beliefs through intermediate points in text? We propose a representation learning framework called breakpoint modeling that allows…
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption,…
Exploring Team-Sourced Hyperlinks to Address Navigation Challenges for Low-Vision Readers of Scientific Papers
Reading academic papers is a fundamental part of higher education and research, but navigating these information-dense texts can be challenging. In particular, low-vision readers using magnification…
NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries?
While a substantial body of prior work has explored adversarial example generation for natural language understanding tasks, these examples are often unrealistic and diverge from the real-world data…
Quantifying the narrative flow of imagined versus autobiographical stories.
Lifelong experiences and learned knowledge lead to shared expectations about how common situations tend to unfold. Such knowledge of narrative event flow enables people to weave together a story…
Generating Sequences by Learning to Self-Correct
Sequence generation applications require satisfying semantic constraints, such as ensuring that programs are correct, using certain keywords, or avoiding undesirable content. Language models,…
Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts
Explicit decomposition modeling, which involves breaking down complex tasks into more straightforward and often more interpretable sub-tasks, has long been a central theme in developing robust and…