Papers

  • Bound by the Bounty: Collaboratively Shaping Evaluation Processes for Queer AI Harms

    Organizers of Queer In AI, Nathaniel Dennler, Anaelia Ovalle, Ashwin Singh, Luca Soldaini, Arjun Subramonian, Huy Tu, William Agnew, Avijit Ghosh, Kyra Yee, Irene Font Peradejordi, Zeerak Talat, Mayra Russo, Jessica de Jesus de Pinho Pinhal. AIES 2023. Bias evaluation benchmarks and dataset and model documentation have emerged as central processes for assessing the biases and harms of artificial intelligence (AI) systems. However, these auditing processes have been criticized for their failure to integrate…
  • Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents

    Catherine Chen, Zejiang Shen, Dan Klein, Gabi Stanovsky, Doug Downey, Kyle Lo. Findings of ACL 2023. Recent work has shown that infusing layout features into language models (LMs) improves processing of visually-rich documents such as scientific papers. Layout-infused LMs are often evaluated on documents with familiar layout features (e.g., papers from the…
  • Riveter: Measuring Power and Social Dynamics Between Entities

    Maria Antoniak, Anjalie Field, Jimin Mun, Melanie Walsh, Lauren F. Klein, Maarten Sap. ACL 2023. Riveter provides a complete, easy-to-use pipeline for analyzing verb connotations associated with entities in text corpora. We prepopulate the package with connotation frames of sentiment, power, and agency, which have demonstrated usefulness for capturing…
  • Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications

    Li Lucy, Jesse Dodge, David Bamman, Katherine A. Keith. Findings of ACL 2023. Scholarly text is often laden with jargon, or specialized language that can facilitate efficient in-group communication within fields but hinder understanding for out-groups. In this work, we develop and validate an interpretable approach for measuring…
  • Estimating the Causal Effect of Early ArXiving on Paper Acceptance

    Yanai Elazar, Jiayao Zhang, David Wadden, Boshen Zhang, Noah A. Smith. arXiv 2023. What is the effect of releasing a preprint of a paper before it is submitted for peer review? No randomized controlled trial has been conducted, so we turn to observational data to answer this question. We use data from the ICLR conference (2018--2022) and…
  • A Controllable QA-based Framework for Decontextualization

    Benjamin Newman, Luca Soldaini, Raymond Fok, Arman Cohan, Kyle Lo. arXiv 2023. Many real-world applications require surfacing extracted snippets to users, whether motivated by assistive tools for literature surveys or document cross-referencing, or the need to mitigate and recover from model-generated inaccuracies. Yet, these passages can…
  • Complex Mathematical Symbol Definition Structures: A Dataset and Model for Coordination Resolution in Definition Extraction

    Anna Martin-Boyle, Andrew Head, Kyle Lo, Risham Sidhu, Marti A. Hearst, Dongyeop Kang. arXiv 2023. Mathematical symbol definition extraction is important for improving scholarly reading interfaces and scholarly information extraction (IE). However, the task poses several challenges: math symbols are difficult to process as they are not composed of natural…
  • Decomposing Complex Queries for Tip-of-the-tongue Retrieval

    Kevin Lin, Kyle Lo, Joseph E. Gonzalez, Dan Klein. arXiv 2023. When re-finding items, users who forget or are uncertain about identifying details often rely on creative strategies for expressing their information needs -- complex queries that describe content elements (e.g., book characters or events), information beyond…
  • TESS: Text-to-Text Self-Conditioned Simplex Diffusion

    Rabeeh Karimi Mahabadi, Jaesung Tae, Hamish Ivison, J. Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan. arXiv 2023. Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various domains with continuous-valued inputs. Despite the promises of fully non-autoregressive text generation, applying diffusion models to natural language…
  • Embedding Recycling for Language Models

    Jon Saad-Falcon, Amanpreet Singh, Luca Soldaini, Mike D'Arcy, Arman Cohan, Doug Downey. Findings of EACL 2023. Training and inference with large neural models are expensive. However, for many application domains, while new tasks and models arise frequently, the underlying documents being modeled remain mostly unchanged. We study how to decrease computational cost in…