    • April 10, 2018

      Jesse Dodge

      Driven by the need for parallelizable hyperparameter optimization methods, we study open loop search methods: sequences that are predetermined and can be generated before a single configuration is evaluated. Examples include grid search, uniform random search, low discrepancy sequences, and other sampling distributions. In particular, we propose the use of k-determinantal point processes in hyperparameter optimization via random search. Compared to conventional uniform random search where hyperparameter settings are sampled independently, a k-DPP promotes diversity. We describe an approach that transforms hyperparameter search spaces for efficient use with a k-DPP. In addition, we introduce a novel Metropolis-Hastings algorithm which can sample from k-DPPs defined over any space from which uniform samples can be drawn, including spaces with a mixture of discrete and continuous dimensions or tree structure. Our experiments show significant benefits when tuning hyperparameters to neural models for text classification, with a limited budget for training supervised learners, whether in serial or parallel.
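As a rough illustration of the Metropolis-Hastings idea in this abstract, the Python sketch below draws k diverse points from a space that only needs to support uniform sampling. It is a minimal sketch under stated assumptions, not the speaker's implementation: the RBF kernel, its length scale, the number of steps, and every function name here are hypothetical choices made for illustration.

```python
import math
import random


def rbf_kernel(x, y, length_scale=0.2):
    """Similarity of two points (assumed kernel; any PSD kernel would do)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * length_scale ** 2))


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def kernel_det(points):
    """Unnormalized k-DPP probability of a set: det of its kernel matrix."""
    return determinant([[rbf_kernel(p, q) for q in points] for p in points])


def sample_k_dpp(k, draw, steps=2000, seed=0):
    """Approximate k-DPP sample: swap one point per step, accept by det ratio."""
    rng = random.Random(seed)
    current = [draw(rng) for _ in range(k)]
    current_det = kernel_det(current)
    for _ in range(steps):
        proposal = current[:]
        proposal[rng.randrange(k)] = draw(rng)  # symmetric proposal
        proposal_det = kernel_det(proposal)
        ratio = 1.0 if current_det <= 0 else proposal_det / current_det
        if rng.random() < min(1.0, ratio):
            current, current_det = proposal, proposal_det
    return current


def uniform_unit_square(rng):
    """Uniform draw from a two-dimensional search space scaled to [0, 1)."""
    return [rng.random(), rng.random()]


configs = sample_k_dpp(5, uniform_unit_square)
```

Because the sampler only calls `draw` and the kernel, `uniform_unit_square` could in principle be replaced by a uniform sampler over a mixed discrete and continuous space, provided a suitable kernel is defined for it.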

    • April 2, 2018

      Rama Vedantam

      Understanding how to model vision and language jointly is a long-standing challenge in artificial intelligence. Vision is one of the primary sensors we use to perceive the world, while language is our data structure to represent and communicate knowledge. In this talk, we will take up three lines of attack to this problem: interpretation, grounding, and imagination. In interpretation, the goal will be to get machine learning models to understand an image and describe its contents using natural language in a contextually relevant manner. In grounding, we will connect natural language to referents in the physical world, and show how this can help learn common sense. Finally, we will study how to ‘imagine’ visual concepts completely and accurately across the full range and (potentially unseen) compositions of their visual attributes. We will study these problems from computational as well as algorithmic perspectives and suggest exciting directions for future work.

    • March 30, 2018

      Keisuke Sakaguchi

      Robustness has always been a desirable property for natural language processing. In many cases, NLP models (e.g., parsing) and downstream applications (e.g., MT) perform poorly when the input contains noise such as spelling errors, grammatical errors, and disfluency. In this talk, I will present three recent results on error correction models at the character, word, and sentence levels, respectively. At the character level, I propose a semi-character recurrent neural network, which is motivated by a finding in psycholinguistics called the Cmabrigde Uinervtisy (Cambridge University) effect. For word-level robustness, I propose an error-repair dependency parsing algorithm for ungrammatical texts. The algorithm can parse sentences and correct grammatical errors simultaneously. Finally, I propose a neural encoder-decoder model with reinforcement learning for sentence-level error correction. To avoid exposure bias in standard encoder-decoders, the model directly optimizes towards a metric for grammatical error correction performance.
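To make the character-level idea concrete, here is a small Python sketch of a "semi-character" word encoding: the first and last letters are kept in position, while the internal letters are reduced to an unordered bag of counts. This layout is an assumption about how such a representation could be built; the abstract itself does not specify it, and the function name, the lowercase-only alphabet, and the handling of one-letter words are all hypothetical choices.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed alphabet: 26 lowercase letters


def semi_character_vector(word):
    """Encode a word as [first-letter one-hot | internal-letter counts | last-letter one-hot]."""
    letters = [ch for ch in word.lower() if ch in ALPHABET]
    first = [0] * len(ALPHABET)
    internal = [0] * len(ALPHABET)
    last = [0] * len(ALPHABET)
    if letters:
        first[ALPHABET.index(letters[0])] = 1
        last[ALPHABET.index(letters[-1])] = 1
        # Internal letters lose their order: only the counts are kept.
        for ch in letters[1:-1]:
            internal[ALPHABET.index(ch)] += 1
    return first + internal + last


# Jumbled and correctly spelled forms map to the same vector.
same = semi_character_vector("Cmabrigde") == semi_character_vector("Cambridge")
```

Under this encoding, a word whose internal letters are scrambled is indistinguishable from the original, which is the property the Cmabrigde Uinervtisy effect suggests human readers tolerate. A recurrent network would then consume one such vector per word.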

    • March 28, 2018

      Arun Chaganty

      A significant challenge in developing systems for tasks such as knowledge base population, text summarization or question answering is simply evaluating their performance: existing fully-automatic evaluation techniques rely on an incomplete set of “gold” annotations that cannot adequately cover the range of possible outputs of such systems and lead to systematic biases against many genuinely useful system improvements. In this talk, I’ll present our work on how we can eliminate this bias by incorporating on-demand human feedback without incurring the full cost of human evaluation. Our key technical innovation is the design of good statistical estimators that are able to trade off cost for variance reduction. We hope that our work will enable the development of better NLP systems by making unbiased natural language evaluation practical and easy to use.
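One standard way to trade annotation cost for variance reduction is a control-variate estimator, sketched below in Python. This is an illustrative assumption, not necessarily the estimator presented in the talk: a small sample of costly human scores is corrected using a cheap automatic metric whose mean over all outputs is known. The function and argument names are hypothetical, and the coefficient is estimated from the same sample, which is a common simplification.

```python
import statistics


def control_variate_estimate(human_scores, metric_on_sample, metric_mean_all):
    """Estimate the mean human score, using an automatic metric as a control variate.

    human_scores:     human judgments on a small annotated sample
    metric_on_sample: automatic metric values on the same sample
    metric_mean_all:  mean of the automatic metric over all system outputs
    """
    n = len(human_scores)
    mean_h = statistics.fmean(human_scores)
    mean_m = statistics.fmean(metric_on_sample)
    # Sample covariance between human scores and the automatic metric.
    cov = sum(
        (h - mean_h) * (m - mean_m)
        for h, m in zip(human_scores, metric_on_sample)
    ) / (n - 1)
    var = statistics.variance(metric_on_sample)
    alpha = cov / var if var > 0 else 0.0
    # Shift the sample mean by how unrepresentative the sample's metric values are.
    return mean_h - alpha * (mean_m - metric_mean_all)


estimate = control_variate_estimate([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], 5.0)
```

The better the automatic metric correlates with human judgment, the more variance the correction removes, so fewer human annotations are needed for the same precision. When the metric is uninformative, the coefficient goes to zero and the estimate falls back to the plain sample mean.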

    • March 26, 2018

      Chenyan Xiong

      Search engines and other information systems have started to evolve from retrieving documents to providing more intelligent information access. However, the evolution is still in its infancy due to computers’ limited ability in representing and understanding human language. This talk will present my work addressing these challenges with knowledge graphs. The first part is about utilizing entities from knowledge graphs to improve search. I will discuss how we build better text representations with entities and how the entity-based text representations improve text retrieval. The second part is about better text understanding through modeling entity salience (importance), as well as how the improved text understanding helps search under both feature-based and neural ranking settings. This talk concludes with future directions towards the next generation of intelligent information systems.

    • March 7, 2018

      Yonatan Belinkov

      Language technology has become pervasive in everyday life, powering applications like Apple’s Siri or Google’s Assistant. Neural networks are a key component in these systems thanks to their ability to model large amounts of data. Contrary to traditional systems, models based on deep neural networks (a.k.a. deep learning) can be trained in an end-to-end fashion on input-output pairs, such as a sentence in one language and its translation in another language, or a speech utterance and its transcription. The end-to-end training paradigm simplifies the engineering process while giving the model flexibility to optimize for the desired task. This, however, often comes at the expense of model interpretability: understanding the role of different parts of the deep neural network is difficult, and such models are often perceived as “black-box”. In this work, I study deep learning models for two core language technology tasks: machine translation and speech recognition. I advocate an approach that attempts to decode the information encoded in such models while they are being trained. I perform a range of experiments comparing different modules, layers, and representations in the end-to-end models. The analyses illuminate the inner workings of end-to-end machine translation and speech recognition systems, explain how they capture different language properties, and suggest potential directions for improving them. The methodology is also applicable to other tasks in the language domain and beyond.

    • March 2, 2018

      Peter Jansen

      Modern question answering systems are able to provide answers to a set of common natural language questions, but their ability to answer complex questions, or provide compelling explanations or justifications for why their answers are correct is still quite limited. These limitations are major barriers in high-impact domains like science and medicine, where the cost of making errors is high, and user trust is paramount. In this talk I'll discuss our recent work in developing systems that can build explanations to answer questions by aggregating information from multiple sources (sometimes called multi-hop inference). Aggregating information is challenging, particularly as the amount of information becomes large due to "semantic drift", or the tendency for inference algorithms to quickly move off-topic when assembling long chains of knowledge. Motivated by our earlier efforts in attempting to latently learn information aggregation for explanation generation (which is currently limited to short inference chains), I will discuss our current efforts to build a large corpus of detailed explanations expressed as lexically-connected explanation graphs to serve as training data for the multi-hop inference task. We will discuss characterizing what's in a science exam explanation, difficulties and methods for large-scale construction of detailed explanation graphs, and the possibility of automatically extracting common explanatory patterns from corpora such as this to support building large explanations (i.e. six or more aggregated facts) for unseen questions through merging, adapting, and adding to known explanatory patterns.

    • February 27, 2018

      Rob Speer and Catherine Havasi

      We are the developers of ConceptNet, a long-running knowledge representation project that originated from crowdsourcing. We demonstrate systems that we’ve made by adding the common knowledge in ConceptNet to current techniques in distributional semantics. This produces word embeddings that are state-of-the-art at semantic similarity in multiple languages, analogies that perform like a moderately-educated human on the SATs, the ability to find relevant distinctions between similar words, and the ability to propose new knowledge-graph edges and “sanity check” them against existing knowledge.

    • February 26, 2018

      Luheng He

      Semantic role labeling (SRL) systems aim to recover the predicate-argument structure of a sentence, to determine “who did what to whom”, “when”, and “where”. In this talk, I will describe my recent SRL work showing that relatively simple and general purpose neural architectures can lead to significant performance gains, including an error reduction of over 40% relative to long-standing pre-neural performance levels. These approaches are relatively simple because they process the text in an end-to-end manner, without relying on the typical NLP pipeline (e.g. POS-tagging or syntactic parsing). They are general purpose because, with only slight modifications, they can be used to learn state-of-the-art models for related semantics problems. The final architecture I will present, which we call Labeled Span Graph Networks (LSGNs), opens up exciting opportunities to build a single, unified model for end-to-end, document-level semantic analysis.

    • February 13, 2018

      Oren Etzioni

      Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence, gave the keynote address at the winter meeting of the Government-University-Industry Research Roundtable (GUIRR) on "Artificial Intelligence and Machine Learning to Accelerate Translational Research".

    • February 12, 2018

      Richard Zhang

      We explore the use of deep networks for image synthesis, both as a graphics goal and as an effective method for representation learning. We propose BicycleGAN, a general system for image-to-image translation problems, with the specific aim of capturing the multimodal nature of the output space. We study image colorization in greater detail and develop automatic and user-guided approaches. Moreover, colorization, as well as cross-channel prediction in general, is a simple but powerful pretext task for self-supervised feature learning. Not only does the network solve the direct graphics task, it also learns to capture patterns in the visual world, even without the benefit of human-curated labels. We demonstrate strong transfer to high-level semantic tasks, such as image classification, and to low-level human perceptual judgments. For the latter, we collect a large-scale dataset of human similarity judgments and find that our method outperforms traditional metrics such as PSNR and SSIM. We also discover that many unsupervised and self-supervised methods transfer strongly, even comparable to fully-supervised methods.

    • January 17, 2018

      Alexander Rush

      Early successes in deep generative models of images have demonstrated the potential of using latent representations to disentangle structural elements. These techniques have, so far, been less useful for learning representations of discrete objects such as sentences. In this talk I will discuss two works on learning different types of latent structure: Structured Attention Networks, a model for learning a soft-latent approximation of discrete structures such as segmentations, parse trees, and chained decisions; and Adversarially Regularized Autoencoders, a new GAN-based autoencoder for learning continuous representations of sentences with applications to textual style transfer. I will end by discussing an empirical analysis of some issues that make latent structure discovery of text difficult.

    • November 21, 2017

      Danqi Chen

      Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved, goal of NLP. This task of reading comprehension (i.e., question answering over a passage of text) has received a resurgence of interest, due to the creation of large-scale datasets and well-designed neural network models.

    • November 20, 2017

      Jacob Walker

      Understanding the temporal dimension of images is a fundamental part of computer vision. Humans are able to interpret how the entities in an image will change over time. However, it has only been relatively recently that researchers have focused on visual forecasting—getting machines to anticipate events in the visual world before they actually happen. This aspect of vision has many practical implications in tasks ranging from human-computer interaction to anomaly detection. In addition, temporal prediction can serve as a task for representation learning, useful for various other recognition problems.

    • November 17, 2017

      Sun Kim

      PubMed is a biomedical literature search engine, hosting more than 27 million bibliographic records. With the abundance and diversity of information in PubMed, many queries retrieve thousands of documents, making it difficult for users to identify the information relevant to their topic of interest. Unlike more general domains, the language of biomedicine uses abundant technical jargon to describe scientific discoveries and applications. To understand the semantics of biomedical text, it is important to identify not only the meanings of individual words, but also of multi-word phrases appearing in text. Controlled vocabularies may help, but the rapid growth of PubMed makes it hard to keep up with the new information.

    • November 7, 2017

      Mohammad Rasooli

      Transfer methods have been shown to be effective alternatives for developing accurate natural language processing systems in the absence of annotated data in the target language of interest. They are divided into two approaches: 1) annotation projection from translation data using supervised models in resource-rich languages; and 2) direct transfer from resource-rich annotated datasets. In this talk, we review our past work on improving on both approaches by applying scalable machine learning methods. We empirically show that our approach is practical on different natural language processing tasks, including dependency parsing, semantic role labeling, and sentiment analysis of Twitter text. For our ongoing and future work, we propose to use a holistic approach to model cross-lingual recurrent representations for many languages and tasks.

    • November 6, 2017

      Gary Marcus

      All-purpose, all-powerful AI systems, capable of catering to our every intellectual need, have been promised for six decades, but thus far have still not arrived. What will it take to bring AI to something like human-level intelligence? And why haven't we gotten there already? Scientist, author, and entrepreneur Gary Marcus (Founder and CEO of Geometric Intelligence, recently acquired by Uber) explains why deep learning is overrated, and what we need to do next to achieve genuine artificial intelligence.

    • October 30, 2017

      Arman Cohan

      The rapid growth of scientific literature has created a challenge for researchers to remain current with new developments. The existence of surveys summarizing the latest state of the field shows that such information is desirable, yet obtaining such summaries requires painstaking manual effort. Scientific document summarization aims at addressing this problem by providing a compact representation of new findings and contributions of the published literature. First, I will present methods for improving text summarization of scientific literature by utilizing citations as an alternative to abstracts. In particular, I will talk about how we can address the problem of potential citation inaccuracy by providing context from the reference to the citations. Utilizing these contexts along with the scientific discourse structure, I will present an effective extractive summarization method for capturing various contributions of the target paper. In addition to the rapid growth of biomedical scientific literature, there is an increasing demand for using health-related text, including clinical notes, patient reports, and social media. I will discuss current challenges in health care, which include medical errors and mental health. As an attempt to address some of these challenges, I will show how we can make qualitative comparisons of errors in clinical care through medical narratives. Further, I will focus on mental health and discuss our proposed approaches to perform depression and self-harm risk assessment utilizing social media data.

    • October 16, 2017

      Chuang Gan

      The increasing ubiquity of devices capable of capturing videos has led to an explosion in the amount of recorded video content. There is therefore a pressing need to develop automatic video analysis and understanding algorithms for various applications, rather than “eyeballing” the videos for potentially useful information. However, understanding videos on a large scale remains challenging because of large variations and complexities, time-consuming annotations, and the wide range of video concepts involved. In light of these challenges, my research towards video understanding focuses on designing effective network architectures to learn robust video representations, learning video concepts from weak supervision, and building a stronger connection between language and vision. In this talk, I will first introduce a Deep Event Network (DevNet) that can simultaneously detect pre-defined events and localize spatial-temporal key evidence. Then I will show how web-crawled videos and images could be utilized for learning video concepts. Finally, I will present our recent efforts to connect visual understanding to language through attractive visual captioning and visual question segmentation.

    • October 4, 2017

      Oren Etzioni

      Does Artificial Intelligence (AI) research result in threats to society, or will it yield beneficial technology? The talk will address these issues by describing the projects and perspective at the Allen Institute for AI (AI2) in Seattle. AI2's mission is "AI for the Common Good," as exemplified by Semantic Scholar, a search engine that utilizes AI to overcome information overload in scientific search.
