Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints
Conditional text generation often requires lexical constraints, i.e., which words should or shouldn’t be included in the output text. While the dominant recipe for conditional text generation has…
Paraphrasing vs Coreferring: Two Sides of the Same Coin
We study the potential synergy between two different NLP tasks, both confronting lexical variability: identifying predicate paraphrases and event coreference resolution. First, we used annotations…
Generative Data Augmentation for Commonsense Reasoning
Recent advances in commonsense reasoning depend on large-scale human-annotated training data to achieve peak performance. However, manual curation of training examples is expensive and has been…
Evaluating Models' Local Decision Boundaries via Contrast Sets
Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading:…
Learning Object Detection from Captions via Textual Scene Attributes
Object detection is a fundamental task in computer vision, requiring large annotated datasets that are difficult to collect, as annotators need to label objects and their bounding boxes. Thus, it is…
Scene Graph to Image Generation with Contextualized Object Layout Refinement
Generating high-quality images from scene graphs, that is, graphs that describe multiple entities in complex relations, is a challenging task that has attracted substantial interest recently. Prior work…
Modelling kidney disease using ontology: insights from the Kidney Precision Medicine Project
An important need exists to better understand and stratify kidney disease according to its underlying pathophysiology in order to develop more precise and effective therapeutic agents. National…
Span-based Semantic Parsing for Compositional Generalization
Despite the success of sequence-to-sequence (seq2seq) models in semantic parsing, recent work has shown that they fail in compositional generalization, i.e., the ability to generalize to new…
GFDL SHiELD: A Unified System for Weather-to-Seasonal Prediction
We present the System for High-resolution prediction on Earth-to-Local Domains (SHiELD), an atmosphere model developed by the Geophysical Fluid Dynamics Laboratory (GFDL) coupling the nonhydrostatic…
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge
Open-domain question answering (QA) is known to involve several underlying knowledge and reasoning challenges, but are models actually learning such knowledge when trained on benchmark tasks? To…