Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Grounded Situation Recognition
We introduce Grounded Situation Recognition (GSR), a task that requires producing structured semantic summaries of images describing: the primary activity, entities engaged in the activity with…
Spatially Aware Multimodal Transformers for TextVQA
Textual cues are essential for everyday tasks like buying groceries and using public transport. To develop this assistive technology, we study the TextVQA task, i.e., reasoning about text in images…
VisualCOMET: Reasoning About the Dynamic Context of a Still Image
Even from a single frame of a still image, people can reason about the dynamic story of the image before, after, and beyond the frame. For example, given an image of a man struggling to stay afloat…
Approximating the Permanent by Sampling from Adaptive Partitions
Computing the permanent of a non-negative matrix is a core problem with practical applications ranging from target tracking to statistical thermodynamics. However, this problem is also #P-complete,…
High-Precision Extraction of Emerging Concepts from Scientific Literature
Identification of new concepts in scientific literature can help power faceted search, scientific trend analysis, knowledge-base construction, and more, but current methods are lacking. Manual…
Dawn: A high-level domain-specific language compiler toolchain for weather and climate applications
High-level programming languages that allow numerical methods to be expressed and efficient parallel implementations to be generated are of key importance for the productivity of domain scientists. The…
Break It Down: A Question Understanding Benchmark
Understanding natural language questions entails the ability to break down a question into the requisite steps for computing its answer. In this work, we introduce a Question Decomposition Meaning…
oLMpics - On what Language Model Pre-training Captures
Recent success of pre-trained language models (LMs) has spurred widespread interest in the language capabilities that they possess. However, efforts to understand whether LM representations are…
Adversarial Filters of Dataset Biases
Large neural models have demonstrated human-level performance on language and vision benchmarks such as ImageNet and Stanford Natural Language Inference (SNLI). Yet, their performance degrades…
Multi-class Hierarchical Question Classification for Multiple Choice Science Exams
Prior work has demonstrated that question classification (QC), recognizing the problem domain of a question, can help answer it more accurately. However, developing strong QC algorithms has been…