
Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

General-Purpose Question-Answering with Macaw

Oyvind Tafjord, Peter Clark

2021

arXiv

Despite the successes of pretrained language models, there are still few high-quality, general-purpose QA systems that are freely available. In response, we present MACAW, a versatile, generative…

Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Comp

T. Gysi, C. Müller, and T. Wicky

2021

ACM Transactions on Architecture and Code Optimization

Most compilers have a single core intermediate representation (IR) (e.g., LLVM) sometimes complemented with vaguely defined IR-like data structures. This IR is commonly low-level and close to…

Factorizing Perception and Policy for Interactive Instruction Following

Kunal Pratap Singh, Suvaansh Bhambri, Byeonghwi Kim, Jonghyun Choi

2021

arXiv

Performing simple household tasks based on language directives is very natural to humans, yet it remains an open challenge for AI agents. The ‘interactive instruction following’ task attempts to…

It's not Rocket Science: Interpreting Figurative Language in Narratives

Tuhin Chakrabarty, Yejin Choi, Vered Shwartz

2021

ACL

Figurative language is ubiquitous in English. Yet, the vast majority of NLP research focuses on literal language. Existing text representations by design rely on compositionality, while figurative…

Question Decomposition with Dependency Graphs

Matan Hasson, Jonathan Berant

2021

AKBC

QDMR is a meaning representation for complex questions, which decomposes questions into a sequence of atomic steps. While state-of-the-art QDMR parsers use the common sequence-to-sequence (seq2seq)…

All That’s ‘Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text

Elizabeth Clark, Tal August, Sofia Serrano, Noah A. Smith

2021

ACL

Human evaluations are typically considered the gold standard in natural language generation, but as models' fluency improves, how well can evaluators detect and judge machine-generated text? We run…

Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies

Mor Geva, Daniel Khashabi, Elad Segal, Jonathan Berant

2021

TACL

A key limitation in current datasets for multi-hop reasoning is that the required steps for answering the question are mentioned in it explicitly. In this work, we introduce STRATEGYQA, a question…

Edited Media Understanding Frames: Reasoning about the Intent and Implications of Visual Disinformation

Jeff Da, Maxwell Forbes, Rowan Zellers, Yejin Choi

2021

ACL

Multimodal disinformation, from 'deepfakes' to simple edits that deceive, is an important societal problem. Yet at the same time, the vast majority of media edits are harmless -- such as a filtered…

Effective Attention Sheds Light On Interpretability

Kaiser Sun and Ana Marasović

2021

Findings of ACL

An attention matrix of a transformer self-attention sublayer can provably be decomposed into two components and only one of them (effective attention) contributes to the model output. This leads us…

Explaining NLP Models via Minimal Contrastive Editing (MiCE)

Alexis Ross, Ana Marasović, Matthew E. Peters

2021

Findings of ACL

Humans give contrastive explanations that explain why an observed event happened rather than some other counterfactual event (the contrast case). Despite the important role that contrastivity plays…
