Papers
Multi-class Hierarchical Question Classification for Multiple Choice Science Exams
Dongfang Xu, Peter Jansen, Jaycie Martin, Zhengnan Xie, Vikas Yadav, Harish Tayyar Madabushi, Oyvind Tafjord, Peter Clark
IJCAI • 2020
Prior work has demonstrated that question classification (QC), recognizing the problem domain of a question, can help answer it more accurately. However, developing strong QC algorithms has been hindered by the limited size and complexity of annotated data…

Transformers as Soft Reasoners over Language
Peter Clark, Oyvind Tafjord, Kyle Richardson
IJCAI • 2020
AI has long pursued the goal of having systems reason over explicitly provided knowledge, but building suitable representations has proved challenging. Here we explore whether transformers can similarly learn to reason (or emulate reasoning), but using rules…

TransOMCS: From Linguistic Graphs to Commonsense Knowledge
Hongming Zhang, Daniel Khashabi, Yangqiu Song, Dan Roth
IJCAI • 2020
Commonsense knowledge acquisition is a key problem for artificial intelligence. Conventional methods of acquiring commonsense knowledge generally require laborious and costly human annotations, which are not feasible on a large scale. In this paper, we…

CORD-19: The Covid-19 Open Research Dataset
L. Lu Wang, K. Lo, Y. Chandrasekhar, R. Reas, J. Yang, D. Eide, K. Funk, R. Kinney, Z. Liu, W. Merrill, P. Mooney, D. Murdick, D. Rishi, J. Sheehan, Z. Shen, B. Stilson, A. D. Wade, K. Wang, C. Wilhelm, B. Xie, D. Raymond, D. S. Weld, O. Etzioni, S. Kohlmeier
ACL • NLP-COVID • 2020
The Covid-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on Covid-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich…

SUPP.AI: finding evidence for supplement-drug interactions
Lucy Lu Wang, Oyvind Tafjord, Arman Cohan, Sarthak Jain, Sam Skjonsberg, Carissa Schoenick, Nick Botner, Waleed Ammar
ACL • Demo • 2020
Dietary supplements are used by a large portion of the population, but information on their pharmacologic interactions is incomplete. To address this challenge, we present SUPP.AI, an application for browsing evidence of supplement-drug interactions…

A Formal Hierarchy of RNN Architectures
William Merrill, Gail Garfinkel Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav
ACL • 2020
We develop a formal hierarchy of the expressive capacity of RNN architectures. The hierarchy is based on two formal properties: space complexity, which measures the RNN's memory, and rational recurrence, defined as whether the recurrent update can be…

A Mixture of h-1 Heads is Better than h Heads
Hao Peng, Roy Schwartz, Dianqi Li, Noah A. Smith
ACL • 2020
Multi-head attentive neural architectures have achieved state-of-the-art results on a variety of natural language processing tasks. Evidence has shown that they are overparameterized; attention heads can be pruned without significant performance loss. In this…

A Two-Stage Masked LM Method for Term Set Expansion
Guy Kushilevitz, Shaul Markovitch, Yoav Goldberg
ACL • 2020
We tackle the task of Term Set Expansion (TSE): given a small seed set of example terms from a semantic class, finding more members of that class. The task is of great practical utility, and also of theoretical utility as it requires generalization from few…

Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith
ACL • 2020
Language models pretrained on text from a wide variety of sources form the foundation of today's NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target…

Improving Transformer Models by Reordering their Sublayers
Ofir Press, Noah A. Smith, Omer Levy
ACL • 2020
Multilayer transformer networks consist of interleaved self-attention and feedforward sublayers. Could ordering the sublayers in a different pattern lead to better performance? We generate randomly ordered transformers and train them with the language…