Ai2

Research - Papers

Explore a selection of our published work on a variety of key research challenges in AI.

Generating Scientific Definitions with Controllable Complexity

Tal August, Katharina Reinecke, Noah A. Smith
2022
ACL

Unfamiliar terminology and complex language can present barriers to understanding science. Natural language processing stands to help address these issues by automatically defining unfamiliar terms.… 

Extracting Latent Steering Vectors from Pretrained Language Models

Nishant Subramani, Nivedita Suresh, Matthew E. Peters
2022
Findings of ACL

Prior work on controllable text generation has focused on learning how to control language models through trainable decoding, smart-prompt design, or fine-tuning based on a desired objective. We… 

Productive Performance Engineering for Weather and Climate Modeling with Python

Tal Ben-Nun, Linus Groner, Florian Deconinck, Torsten Hoefler
2022
arXiv

Earth system models are developed with a tight coupling to target hardware, often containing highly-specialized code predicated on processor characteristics. This coupling stems from using… 

Generating Scientific Claims for Zero-Shot Scientific Fact Checking

Dustin Wright, David Wadden, Kyle Lo, Lucy Lu Wang
2022
ACL

Automated scientific fact checking is difficult due to the complexity of scientific language and a lack of significant amounts of training data, as annotation requires domain expertise. To address… 

Is GPT-3 Text Indistinguishable from Human Text? SCARECROW: A Framework for Scrutinizing Machine Text

Yao Dou, Maxwell Forbes, Rik Koncel-Kedziorski, Yejin Choi
2022
ACL

Modern neural text generation systems can produce remarkably fluent and grammatical texts. While earlier language models suffered from repetition and syntactic errors, the errors made by contemporary… 

Situated Dialogue Learning through Procedural Environment Generation

Prithviraj Ammanabrolu, Renee Jia, Mark O. Riedl
2022
ACL

We teach goal-driven agents to interactively act and speak in situated environments by training on generated curriculums. Our agents operate in LIGHT (Urbanek et al. 2019)—a large-scale… 

Draw Me a Flower: Grounding Formal Abstract Structures Stated in Informal Natural Language

Royi Lachmy, Valentina Pyatkin, Reut Tsarfaty
2022
ACL

Forming and interpreting abstraction is a core process in human communication. In particular, when giving and performing complex instructions stated in natural language (NL), people may naturally… 

ACCoRD: A Multi-Document Approach to Generating Diverse Descriptions of Scientific Concepts

Sonia K. Murthy, Kyle Lo, Daniel King, Doug Downey
2022
arXiv

Systems that can automatically define unfamiliar terms hold the promise of improving the accessibility of scientific texts, especially for readers who may lack prerequisite background knowledge.… 

Understanding Dataset Difficulty with 𝒱-Usable Information

Kawin Ethayarajh, Yejin Choi, Swabha Swayamdipta
2022
ICML

Estimating the difficulty of a dataset typically involves comparing state-of-the-art models to humans; the bigger the performance gap, the harder the dataset is said to be. However, this comparison… 

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

Wen Xiao, Iz Beltagy, G. Carenini, Arman Cohan
2022
ACL

We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning…