Papers

  • Embedding Recycling for Language Models

    Jon Saad-Falcon, Amanpreet Singh, Luca Soldaini, Mike D'Arcy, Arman Cohan, Doug Downey
    Findings of EACL 2023
    Training and inference with large neural models is expensive. However, for many application domains, while new tasks and models arise frequently, the underlying documents being modeled remain mostly unchanged. We study how to decrease computational cost in…
  • CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context

    Joseph Chee Chang, Amy X. Zhang, Jonathan Bragg, Andrew Head, Kyle Lo, Doug Downey, Daniel S. Weld
    CHI 2023
    When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work. However, it can be challenging to prioritize and make sense of the hundreds of citations encountered during literature…
  • ComLittee: Literature Discovery with Personal Elected Author Committees

    Hyeonsu B Kang, Nouran Soliman, Matt Latzke, Joseph Chee Chang, Jonathan Bragg
    CHI 2023
    In order to help scholars understand and follow a research topic, significant research has been devoted to creating systems that help scholars discover relevant papers and authors. Recent approaches have shown the usefulness of highlighting relevant authors…
  • Relatedly: Scaffolding Literature Reviews with Existing Related Work Sections

    Srishti Palani, Aakanksha Naik, Doug Downey, Amy X. Zhang, Jonathan Bragg, Joseph Chee Chang
    CHI 2023
    Scholars who want to research a scientific topic must take time to read, extract meaning, and identify connections across many papers. As scientific literature grows, this becomes increasingly challenging. Meanwhile, authors summarize prior research in papers…
  • The Semantic Scholar Open Data Platform

    Rodney Michael Kinney, Chloe Anastasiades, Russell Authur, Iz Beltagy, Jonathan Bragg, Alexandra Buraczynski, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Arman Cohan, Miles Crawford, Doug Downey, J. Dunkelberger, Oren Etzioni, R. Evans, Sergey Feldman, Joseph Gorney, D. Graham, F.Q. Hu, Regan Huff, Daniel King, Sebastian Kohlmeier, Bailey Kuehl, Michael Langan, Daniel Lin, Haokun Liu, Kyle Lo, Jaron Lochner, Kelsey MacMillan, Tyler Murray, Christopher Newell, Smita Rao, Shaurya Rohatgi, P. Sayre, Zejiang Shen, Amanpreet Singh, Luca Soldaini, Shivashankar Subramanian, A. Tanaka, Alex D Wade, Linda M. Wagner, Lucy Lu Wang, Christopher Wilhelm, Caroline Wu, Jiangjiang Yang, A. Zamarron, Madeleine van Zuylen, Daniel S. Weld
    arXiv 2023
    The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field. Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover… (A minimal example of querying the S2 public API appears after this list.)
  • I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

    Chandra Bhagavatula, Jena D. Hwang, Doug Downey, Ronan Le Bras, Ximing Lu, Keisuke Sakaguchi, Swabha Swayamdipta, Peter West, Yejin Choi
    arXiv 2022
    Pre-trained language models, despite their rapid advancements powered by scale, still fall short of robust commonsense capabilities. And yet, scale appears to be the winning recipe; after all, the largest models seem to have acquired the largest amount of…
  • Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems

    Yoshitomo Matsubara, Luca Soldaini, Eric Lind, Alessandro Moschitti
    Findings of EMNLP 2022
    Large transformer models can greatly improve Answer Sentence Selection (AS2) tasks, but their high computational costs prevent their use in many real-world applications. In this paper, we explore the following research question: How can we make the AS2 models…
  • GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation

    Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Nicholas Lourie, Jungo Kasai, Yejin Choi, Noah A. Smith, Daniel S. Weld
    EMNLP 2022
    While often assumed a gold standard, effective human evaluation of text generation remains an important, open area for research. We revisit this problem with a focus on producing consistent evaluations that are reproducible over time and across different…
  • Knowledge Transfer from Answer Ranking to Answer Generation

    Matteo Gabburo, Rik Koncel-Kedziorski, Siddhant Garg, Luca Soldaini, Alessandro Moschitti
    EMNLP 2022
    Recent studies show that Question Answering (QA) based on Answer Sentence Selection (AS2) can be improved by generating an improved answer from the top-k ranked answer sentences (termed GenQA). This allows for synthesizing the information from multiple…
  • Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection

    Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti
    EMNLP 2022
    An important task for designing QA systems is answer sentence selection (AS2): selecting the sentence containing (or constituting) the answer to a question from a set of retrieved relevant documents. In this paper, we propose three novel sentence-level…
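The Semantic Scholar Open Data Platform entry above describes S2's public data services. As a minimal, illustrative sketch rather than code from that paper, the snippet below queries the public Semantic Scholar Graph API for papers matching a search string; the endpoint, query parameters, and response fields follow the API's public documentation at the time of writing and may change.

```python
# Minimal sketch: search the public Semantic Scholar Graph API for papers.
# The endpoint, parameters, and response fields are taken from the public
# API documentation and may change; this is not code from the S2 paper itself.
import requests

S2_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

def search_papers(query: str, limit: int = 5) -> list[dict]:
    """Return basic metadata for papers matching `query`."""
    response = requests.get(
        S2_SEARCH_URL,
        params={"query": query, "limit": limit, "fields": "title,year,authors"},
        timeout=30,
    )
    response.raise_for_status()
    # The search response wraps the matching papers in a "data" list.
    return response.json().get("data", [])

if __name__ == "__main__":
    for paper in search_papers("embedding recycling for language models"):
        authors = ", ".join(a["name"] for a in paper.get("authors", []))
        print(f"{paper.get('year')}: {paper.get('title')} ({authors})")
```

Unauthenticated requests to the API are rate limited, so larger-scale use generally relies on an API key or the bulk dataset downloads the platform provides.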