Skip to main content ->
Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Filter papers

S2AND: A Benchmark and Evaluation System for Author Name Disambiguation

Shivashankar SubramanianDaniel KingDoug DowneySergey Feldman
2021
JCDL

Author Name Disambiguation (AND) is the task of resolving which author mentions in a bibliographic database refer to the same real-world person, and is a critical ingredient of digital library… 

COVR: A test-bed for Visually Grounded Compositional Generalization with real images

Ben BoginShivanshu GuptaMatt GardnerJonathan Berant
2021
EMNLP

While interest in models that generalize at test time to new compositions has risen in recent years, benchmarks in the visually-grounded domain have thus far been restricted to synthetic images. In… 

Conversational Multi-Hop Reasoning with Neural Commonsense Knowledge and Symbolic Logic Rules

Forough ArabshahiJennifer LeeA. BosselutTom Mitchell
2021
EMNLP

One of the challenges faced by conversational agents is their inability to identify unstated presumptions of their users’ commands, a task trivial for humans due to their common sense. In this… 

General-Purpose Question-Answering with Macaw

Oyvind TafjordPeter Clark
2021
arXiv

Despite the successes of pretrained language models, there are still few high-quality, general-purpose QA systems that are freely available. In response, we present MACAW, a versatile, generative… 

Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Comp

GysiT.C. Müllerand T. Wicky
2021
ACM Transactions on Architecture and Code Optimization

Most compilers have a single core intermediate representation (IR) (e.g., LLVM) sometimes complemented with vaguely defined IR-like data structures. This IR is commonly low-level and close to… 

Factorizing Perception and Policy for Interactive Instruction Following

Kunal Pratap SinghSuvaansh BhambriByeonghwi KimJonghyun Choi
2021
arXiv

Performing simple household tasks based on language directives is very natural to humans, yet it remains an open challenge for AI agents. The ‘interactive instruction following’ task attempts to… 

It's not Rocket Science : Interpreting Figurative Language in Narratives

Tuhin ChakrabartyYejin ChoiVered Shwartz
2021
ACL

Figurative language is ubiquitous in English. Yet, the vast majority of NLP research focuses on literal language. Existing text representations by design rely on compositionality, while figurative… 

Question Decomposition with Dependency Graphs

Matan HassonJonathan Berant
2021
AKBC

QDMR is a meaning representation for complex questions, which decomposes questions into a sequence of atomic steps. While stateof-the-art QDMR parsers use the common sequence-to-sequence (seq2seq)… 

All That’s ‘Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text

Elizabeth ClarkTal AugustSofia SerranoNoah A. Smith
2021
ACL

Human evaluations are typically considered the gold standard in natural language generation, but as models' fluency improves, how well can evaluators detect and judge machine-generated text? We run… 

Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies

Mor GevaDaniel KhashabiElad SegalJonathan Berant
2021
TACL

A key limitation in current datasets for multi-hop reasoning is that the required steps for answering the question are mentioned in it explicitly. In this work, we introduce STRATEGYQA, a question…