
See AI2's full collection of videos on our YouTube channel.
  • On Parameter Efficiency of Neural Language Models

    November 2, 2023  |  Chen Liang
    Abstract: Pre-trained neural language models have demonstrated remarkable generalizability in various downstream tasks, such as natural language understanding and question answering. However, these models have grown to contain hundreds of billions of parameters, making them difficult to deploy in…
  • Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing

    November 2, 2023  |  Tom Sherborne
    Abstract: Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data. Previous work has primarily considered silver-standard data augmentation or zero-shot methods; however, exploiting few-shot gold data is…
  • Benchmarking Compositionality with Formal Languages

    November 1, 2023  |  Josef Valvoda
    Abstract: Recombining known primitive concepts into larger novel combinations is a quintessentially human cognitive capability. Whether large neural models in NLP can acquire this ability while learning from data is an open question. In this paper, we investigate this problem from the perspective of formal…
  • Studying Large Language Model Generalization with Influence Functions

    October 31, 2023  |  Roger Grosse
    Abstract: When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions aim to answer a counterfactual: how would the…
  • Modular Language Models

    October 16, 2023  |  Suchin Gururangan
    Abstract: Conventional language models (LMs) are trained densely: all parameters are updated with respect to all data. We argue that dense training leads to a variety of well-documented issues with LMs, including their prohibitive training cost and unreliable downstream behavior. We then introduce a new class of LMs that…
  • Towards Cost-Efficient Use of Pre-trained Models

    October 10, 2023  |  Alan Ritter
    Abstract: Large language models are leading to many exciting breakthroughs, but this comes at a significant cost in terms of both computational and data labeling expenses. Training state-of-the-art models requires access to high-end GPUs for pre-training and inference, in addition to labeled data for fine-tuning…
  • Reliability and interactive debugging for language models

    October 6, 2023  |  Bhargavi Paranjape
    Abstract: Large language models have permeated our everyday lives and are used in critical decision-making scenarios that can affect millions of people. Despite their impressive progress, model deficiencies may exacerbate harmful biases or lead to catastrophic failures. In this talk, I discuss several…
  • The University of Washington eScience Institute: a Home for Data-Intensive Discovery

    Abstract: The University of Washington eScience Institute, one of the nation's first university data science institutes, grew out of the Moore-Sloan Data Science Environment effort, which focused on identifying and tackling impediments to the broad and sustainable adoption of data-intensive discovery. With a…
  • Reliable Evaluation and High-Quality Data: Building Blocks for Helpful Question Answering Systems

    September 26, 2023  |  Ehsan Kamalloo
    Abstract: As models continue to rapidly evolve in complexity and scale, the status quo of how they are being evaluated and the quality of benchmarks has not significantly changed. This inertia leaves challenges in evaluation and data quality unaddressed, which results in the potential for erroneous conclusions…
  • Vision Without Labels

    September 13, 2023  |  Bharath Hariharan, Cornell University
    Bio: Bharath Hariharan is an assistant professor at Cornell University. He works on problems in computer vision and machine learning that defy the big data label. He did his PhD at University of California, Berkeley with Jitendra Malik. His work has been recognized with an NSF CAREER and a PAMI Young Researcher…