Viewing 26 videos from 2017. See AI2’s full collection of videos on our YouTube channel.
    • November 21, 2017

      Danqi Chen

      Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved, goal of NLP. This task of reading comprehension (i.e., question answering over a passage of text) has received a resurgence of interest, due to the creation of large-scale datasets and well-designed neural network models.

    • November 20, 2017

      Jacob Walker

      Understanding the temporal dimension of images is a fundamental part of computer vision. Humans are able to interpret how the entities in an image will change over time. However, it has only been relatively recently that researchers have focused on visual forecasting—getting machines to anticipate events in the visual world before they actually happen. This aspect of vision has many practical implications in tasks ranging from human-computer interaction to anomaly detection. In addition, temporal prediction can serve as a task for representation learning, useful for various other recognition problems.

    • November 17, 2017

      Sun Kim

      PubMed is a biomedical literature search engine, hosting more than 27 million bibliographic records. With the abundance and diversity of information in PubMed, many queries retrieve thousands of documents, making it difficult for users to identify the information relevant to their topic of interest. Unlike more general domains, the language of biomedicine uses abundant technical jargon to describe scientific discoveries and applications. To understand the semantics of biomedical text, it is important to identify not only the meanings of individual words, but also of multi-word phrases appearing in text. Controlled vocabularies may help, but the rapid growth of PubMed makes it hard to keep up with the new information.

    • November 7, 2017

      Mohammad Rasooli

      Transfer methods have been shown to be effective alternatives for developing accurate natural language processing systems in the absence of annotated data in the target language of interest. They are divided into two approaches: 1) annotation projection from translation data using supervised models in resource-rich languages; and 2) direct transfer from resource-rich annotated datasets. In this talk, we review our past work on improving on both approaches by applying scalable machine learning methods. We empirically show that our approach is practical on different natural language processing tasks, including dependency parsing, semantic role labeling, and sentiment analysis of Twitter text. For our ongoing and future work, we propose a holistic approach to modeling cross-lingual recurrent representations for many languages and tasks.

    • November 6, 2017

      Gary Marcus

      All-purpose, all-powerful AI systems, capable of catering to our every intellectual need, have been promised for six decades but have still not arrived. What will it take to bring AI to something like human-level intelligence? And why haven't we gotten there already? Scientist, author, and entrepreneur Gary Marcus (Founder and CEO of Geometric Intelligence, recently acquired by Uber) explains why deep learning is overrated, and what we need to do next to achieve genuine artificial intelligence.

    • October 30, 2017

      Arman Cohan

      The rapid growth of scientific literature has made it challenging for researchers to remain current with new developments. The existence of surveys summarizing the latest state of the field shows that such information is desirable, yet obtaining such summaries requires painstaking manual effort. Scientific document summarization aims to address this problem by providing a compact representation of the new findings and contributions of the published literature. First, I will present methods for improving text summarization of scientific literature by utilizing citations as an alternative to abstracts. In particular, I will talk about how we can address the problem of potential citation inaccuracy by providing context from the reference to the citations. Utilizing these contexts along with the scientific discourse structure, I will present an effective extractive summarization method for capturing various contributions of the target paper. In addition to the rapid growth of biomedical scientific literature, there is an increasing demand for using health-related text, including clinical notes, patient reports, and social media. I will discuss current challenges in health care, which include medical errors and mental health. As an attempt to address some of these challenges, I will show how we can make a qualitative comparison of errors in clinical care through medical narratives. Further, I will focus on mental health and discuss our proposed approaches to depression and self-harm risk assessment using social media data.

    • October 16, 2017

      Chuang Gan

      The increasing ubiquity of devices capable of capturing videos has led to an explosion in the amount of recorded video content. Instead of “eyeballing” the videos for potentially useful information, there is a pressing need to develop automatic video analysis and understanding algorithms for various applications. However, understanding videos on a large scale remains challenging because of large variations and complexities, time-consuming annotations, and the wide range of video concepts involved. In light of these challenges, my research towards video understanding focuses on designing effective network architectures to learn robust video representations, learning video concepts from weak supervision, and building a stronger connection between language and vision. In this talk, I will first introduce a Deep Event Network (DevNet) that can simultaneously detect pre-defined events and localize spatial-temporal key evidence. Then I will show how web-crawled videos and images could be utilized for learning video concepts. Finally, I will present our recent efforts to connect visual understanding to language through attractive visual captioning and visual question segmentation.

    • October 4, 2017

      Oren Etzioni

      Does Artificial Intelligence (AI) research result in threats to society, or will it yield beneficial technology? The talk will address these issues by describing the projects and perspective at the Allen Institute for AI (AI2) in Seattle. AI2's mission is "AI for the Common Good," as exemplified by Semantic Scholar, a search engine that utilizes AI to overcome information overload in scientific search.

    • September 15, 2017

      Horacio Saggion

      In the current online Open Science context, scientific datasets and tools for deep text analysis, visualization and exploitation play a major role. I will present a system developed over the past three years for “deep” analysis and annotation of scientific text collections. After a brief overview of the system and its main components, I will present our current work on the development of a bilingual (Spanish and English) fully annotated text resource in the field of natural language processing that we have created with our system. Moreover, a faceted-search and visualization system to explore the created resource will also be discussed.

    • August 16, 2017

      Leo Boytsov

      We explore alternatives to classic term-based retrieval. The ultimate objective is to develop a smarter candidate generation component for question answering (QA) and information retrieval (IR), which can employ similarities that are more expressive than the commonly used TF-IDF ranking function. Achieving this objective requires solving two subproblems: designing simple yet effective similarity functions and developing efficient solutions for k-NN search.
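      To make the two subproblems concrete, here is a minimal sketch of the baseline setting the abstract refers to: candidate generation as exact (brute-force) k-NN search under a TF-IDF cosine similarity. It is not the speaker's method. The function names, the smoothed IDF variant, and the toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; one common variant among several (an assumption here).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def knn(query_tokens, vectors, idf, k=2):
    """Exact k-NN: score every document, return indices of the k most similar."""
    # Query terms never seen in the collection are ignored.
    q = {t: tf * idf[t] for t, tf in Counter(query_tokens).items() if t in idf}
    scored = sorted(
        ((cosine(q, v), i) for i, v in enumerate(vectors)),
        key=lambda pair: (-pair[0], pair[1]),  # best score first, ties by index
    )
    return [i for _, i in scored[:k]]


# Hypothetical toy collection.
docs = [
    "the cat sat on the mat".split(),
    "dogs chase the cat".split(),
    "stock markets fell sharply".split(),
]
vectors, idf = tfidf_vectors(docs)
top = knn("cat on a mat".split(), vectors, idf, k=2)
print(top)
```

      Swapping `cosine` for a more expressive similarity function leaves `knn` unchanged, but the linear scan over all documents is exactly the cost that an efficient k-NN search method would need to avoid.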

    • August 9, 2017

      Gabi Stanovsky

      Propositions are statements for which a truth value can be assigned (e.g., “Bob loves Mary”). Since they constitute the primary unit of information conveyed in texts, proposition extraction is often used in NLP algorithms such as question answering, summarization, or recognizing textual entailment. I will begin the talk with an overview of my research, which revolves around the different aspects of proposition extraction: from formalizing requirements and evaluation metrics, through annotation and crowdsourcing techniques, to modeling and automatic prediction. I will then describe two concrete research efforts which exemplify these aspects, while making use of the recent QA-SRL paradigm.

    • July 25, 2017

      Oren Etzioni

      This video discusses the paper: Moving Beyond the Turing Test with the Allen AI Science Challenge. The field of Artificial Intelligence has made great strides forward recently, for example AlphaGo's recent victory against the world champion Lee Sedol in the game of Go, leading to great optimism about the field. But are we really moving towards smarter machines, or are these successes restricted to certain classes of problems, leaving other challenges untouched? In 2016, the Allen Institute for Artificial Intelligence (AI2) ran the Allen AI Science Challenge, a competition to test machines on an ostensibly difficult task, namely answering 8th Grade science questions. Our motivations were to encourage the field to set its sights broader and higher by exploring a problem that appears to require modeling, reasoning, language understanding, and commonsense knowledge, to probe the state of the art on this task, and sow the seeds for possible future breakthroughs. The challenge received a strong response, with 780 teams from all over the world participating. What were the results? This article describes the competition and the interesting outcomes of the challenge.

    • June 22, 2017

      Arvind Neelakantan

      Knowledge representation and reasoning is one of the central challenges of artificial intelligence, and has important implications in many fields including natural language understanding and robotics. Representing knowledge with symbols, and reasoning via search and logic has been the dominant paradigm for many decades. In this work, we use deep neural networks to learn to both represent symbols and perform reasoning end-to-end from data. By learning powerful non-linear models, our approach generalizes to massive amounts of knowledge and works well with messy real-world data using minimal human effort. First, we show that recurrent neural networks with an attention mechanism achieve state-of-the-art reasoning on a large structured knowledge graph. Next, we develop Neural Programmer, a neural network augmented with discrete operations that can be learned to induce latent programs with backpropagation. We apply Neural Programmer to induce short programs on a natural language question answering dataset that requires reasoning on semi-structured Wikipedia tables. We present what is to our awareness the first weakly supervised, end-to-end neural network model to induce such programs on a real-world dataset. Unlike previous learning approaches to program induction, the model does not require domain-specific grammars, rules, or annotations.

    • June 13, 2017

      Oren Etzioni

      As computer automation is upon us and many jobs will change or be replaced by AIs, AI optimist Oren Etzioni, CEO, Allen Institute for AI, describes the social impacts we must consider as he paints a possible euphonic future state in which jobs will be more creative and fulfilling. About XPRIZE: XPRIZE is an educational 501(c)(3) nonprofit organization whose mission is to bring about radical breakthroughs for the benefit of humanity, thereby inspiring the formation of new industries and the revitalization of markets that are currently stuck due to existing failures or a commonly held belief that a solution is not possible. XPRIZE addresses the world's Grand Challenges by creating and managing large-scale, high-profile, incentivized prize competitions that stimulate investment in research and development worth far more than the prize itself. It motivates and inspires brilliant innovators from all disciplines to leverage their intellectual and financial capital.

    • May 22, 2017

      Abhinav Gupta

      In 2013, we proposed NEIL (Never Ending Image Learner), a computer program to learn visual models and commonsense knowledge from the web. In its first version, NEIL ran for 2.5 years learning 8K concepts, labeling 4.5M images and learning 20K common-sense facts. But it also helped us discover the shortcomings of the current paradigm of learning and reasoning with knowledge. In this talk, I am going to describe our subsequent efforts to overcome these drawbacks.

      On the learning side, I will talk about how we scale up learning visual models to rare and compositional categories (“wet possum”). Note that the web-search data for compositional categories are noisy and cannot be used “as is” for learning. The core problem in compositional categories is respecting contextuality. The meaning of a primitive category changes based on the concepts it is composed with (red in red wine is different from red in red car). I will talk about how we can respect contextuality while composing categories.

      On the reasoning side, I will talk about how we can incorporate the learned knowledge graphs in end-to-end learning. Specifically, we will show how these “noisy” knowledge graphs can not only improve classification performance but also provide “explainability” which is crucial for AI systems. I will also show some of our recent work on using knowledge graphs for zero-shot learning (again in an end-to-end manner).

    • May 19, 2017

      Scott Yih

      Building a question answering system to automatically answer natural-language questions is a long-standing research problem. While traditionally unstructured text collections are the main information source for answering questions, the development of large-scale knowledge bases provides new opportunities for open-domain factoid question answering. In this talk, I will present our recent work on semantic parsing, which maps natural language questions to structured queries that can be executed on a graph knowledge base to answer the questions. Our approach defines a query graph that resembles subgraphs of the knowledge base and can be directly mapped to a logical form. With this design, semantic parsing is reduced to query graph generation, formulated as a staged search problem. Compared to existing methods, our solution is conceptually simple and yet outperforms previous state-of-the-art results substantially.

    • May 9, 2017

      Luheng He

      Semantic role labeling (SRL) systems aim to recover the predicate-argument structure of a sentence, to determine essentially “who did what to whom”, “when”, and “where”. We introduce a new deep learning model for SRL that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations. We use a deep highway BiLSTM architecture with constrained decoding, while observing a number of recent best practices for initialization and regularization. Our 8-layer ensemble model achieves 83.2 F1 on the CoNLL 2005 test set and 83.4 F1 on CoNLL 2012, roughly a 10% relative error reduction over the previous state of the art. Extensive empirical analysis of these gains shows that (1) deep models excel at recovering long-distance dependencies but can still make surprisingly obvious errors, and (2) there is still room for syntactic parsers to improve these results. These findings suggest directions for future improvements on SRL performance.

    • May 8, 2017

      Derry Wijaya

      One way to formulate natural language understanding is to treat it as a task of mapping natural language text to its meaning representation: entities and relations anchored to the world. Since verbs express relations over their arguments and adjuncts, a lexical resource about verbs can facilitate natural language understanding by mapping verbs to relations over entities expressed by their arguments and adjuncts in the world. In my thesis work, I semi-automatically construct a large-scale verb resource called VerbKB that contains some of these mappings for natural language understanding. A verb lexical unit in VerbKB consists of a verb lexeme, or a verb lexeme and a preposition (e.g., “live”, “live in”), and is typed with a pair of NELL knowledge base semantic categories that indicate its subject type and its object type (e.g., “live in”(person, location)). In this talk, I will present the algorithms behind VerbKB, which complements existing verb resources such as WordNet and VerbNet and existing knowledge bases about entities such as NELL. VerbKB contains two types of mappings: (1) mappings from verb lexical units to binary relations in knowledge bases (e.g., the mapping from the verb lexical unit “die at”(person, nonNegInteger) to the binary relation personDiedAtAge) and (2) mappings from verb lexical units to changes in binary relations in knowledge bases (e.g., the mapping from the verb lexical unit “divorce”(person, person) to the termination of the relation hasSpouse). I will present algorithms for these two mappings and describe how we extend VerbKB to cover relations beyond the existing relations in the NELL knowledge base. In the spirit of building multilingual lexical resources for NLP, I will also briefly discuss my recent work on building lexical translations for high-resource and low-resource languages from monolingual or comparable corpora.
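      The two kinds of mappings described above can be pictured as a small data structure. The sketch below is illustrative only and does not reflect how VerbKB is actually stored. The class names, the `change` label vocabulary, and the dictionary layout are assumptions. The verb lexical units and relation names are the examples given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbLexicalUnit:
    """A verb lexeme (optionally with a preposition) typed by subject and object."""
    phrase: str
    subject_type: str
    object_type: str


@dataclass(frozen=True)
class RelationChange:
    """A change to a binary relation; the label values are hypothetical."""
    relation: str
    change: str  # e.g. "terminate"


# Examples taken from the abstract.
live_in = VerbLexicalUnit("live in", "person", "location")
die_at = VerbLexicalUnit("die at", "person", "nonNegInteger")
divorce = VerbLexicalUnit("divorce", "person", "person")

# Mapping type (1): verb lexical unit -> binary relation.
verb_to_relation = {die_at: "personDiedAtAge"}

# Mapping type (2): verb lexical unit -> change in a binary relation.
verb_to_change = {divorce: RelationChange("hasSpouse", "terminate")}

print(verb_to_relation[die_at])
print(verb_to_change[divorce])
```

      Because the unit is keyed on the phrase together with its type pair, the same verb phrase with different argument types is a distinct entry, which mirrors the typing described in the abstract.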

    • May 2, 2017

      Mark Yatskar

      In this talk, we examine the role of language in enabling grounded intelligence. We consider two applications where language can be used as a scaffold for (a) allowing for the quick acquisition of large-scale commonsense knowledge, and (b) enabling broad-coverage recognition of events in images. We present some of the technical challenges with using language-based representations for grounding, such as sparsity, and finally present some social challenges, such as amplified gender bias in models trained on language grounding datasets.

    • April 19, 2017

      Mohit Iyyer

      Creative language—the sort found in novels, film, and comics—contains a wide range of linguistic phenomena, from phrasal and sentential syntactic complexity to high-level discourse structures such as narrative and character arcs. In this talk, I explore how we can use deep learning to understand, generate, and answer questions about creative language. I begin by presenting deep neural network models for two tasks involving creative language understanding: 1) modeling dynamic relationships between fictional characters in novels, for which our models achieve higher interpretability and accuracy than existing work; and 2) predicting dialogue and artwork from comic book panels, in which we demonstrate that even state-of-the-art deep models struggle on problems that require commonsense reasoning. Next, I introduce deep models that outperform all but the best human players on quiz bowl, a trivia game that contains many questions about creative language. Shifting to ongoing work, I describe a neural language generation method that disentangles the content of a novel (i.e., the information or story it conveys) from the style in which it is written. Finally, I conclude by integrating my work on deep learning, creative language, and question answering into a future research plan to build conversational agents that are both engaging and useful.

    • April 18, 2017

      Marti Hearst

      AI2 researchers are making groundbreaking advances in machine interpretation of scientific and educational text and images. In our current research, we are interested in improving educational technology, especially automated and semi-automated guidance systems. In past work, we have been successful in leveraging existing metadata and ontologies to produce highly usable search interfaces, and so in one very new line of work, we are investigating if we can automatically create good practice questions from a preexisting biology ontology. In the first half of this talk, I will describe this very new work, as well as some as yet unexplored goals for future work in this space. AI2 researchers are also producing the world’s best citation search system. In the second half of this talk I will describe some prior NLP and HCI work on analyzing bioscience citation text which might be of interest to the Semantic Scholar team as well as the NLP teams.

    • February 20, 2017

      He He

      The future of virtual assistants, self-driving cars, and smart homes requires intelligent agents that work intimately with users. Instead of passively following orders given by users, an interactive agent must actively collaborate with people through communication, coordination, and user-adaptation. In this talk, I will present our recent work towards building agents that interact with humans. First, we propose a symmetric collaborative dialogue setting in which two agents, each with some private knowledge, must communicate in natural language to achieve a common goal. We present a human-human dialogue dataset that poses new challenges to existing models, and propose a neural model with dynamic knowledge graph embedding. Second, we study the user-adaptation problem in quizbowl, a competitive, incremental question-answering game. We show that explicitly modeling different human behavior leads to more effective policies that exploit sub-optimal players. I will conclude by discussing opportunities and open questions in learning interactive agents.

    • February 16, 2017

      Christopher Lin

      Research in artificial intelligence and machine learning (ML) has exploded in the last decade, bringing humanity to the cusp of self-driving cars, digital personal assistants, and unbeatable game-playing robots. My research, which spans the areas of AI, ML, Crowdsourcing, and Natural Language Processing (NLP), focuses on an area where machines are still significantly inferior to humans, despite their super-human intelligence in so many other facets of life: the intelligent management of machine learning (iML), or the ability to reason about what they don’t know so that they may independently and efficiently close gaps in knowledge. iML encompasses many important questions surrounding the ML pipeline, including, but not limited to: 1) How can an agent optimally obtain high-quality labels? 2) How can an agent that is trying to learn a new concept sift through all the unlabeled examples that exist in the world to identify exemplary subsets that would make good training and test sets? An agent must be able to identify examples that are positive for that concept. Learning is extremely expensive, if not impossible, if one cannot find representative examples. 3) Given a fixed budget, should an agent try to obtain a large but noisy training set, or a small but clean one? How can an agent achieve more cost-effective learning by carefully considering this tradeoff? In this talk, I will go into depth on the third question. I will first discuss properties of learning problems that affect this tradeoff. Then I will introduce re-active learning, a generalization of active learning that allows for the relabeling of existing examples, and show why traditional active learning algorithms don't work well for re-active learning. Finally, I will introduce new algorithms for re-active learning and show that they perform well on several domains.

    • February 13, 2017

      Wenpeng Yin

      Wenpeng's talk mainly covers his work developing state-of-the-art deep neural networks to learn representations for language units of different granularity, including single words, phrases, sentences, documents and knowledge graphs (KG). Specifically, he tries to address these questions: (a) With so many pre-trained word embeddings available, is there an upper bound? What is the cheapest way to get higher-quality word embeddings: more training data, or a more advanced algorithm or objective function? (b) How to learn representations for phrases that appear continuously as well as discontinuously? How to derive representations for phrases of arbitrary lengths? (c) How to learn sentence representations under supervised, unsupervised, or context-constrained settings? (d) Given a question, how to distill the document so that its representation is specific to the question? (e) In knowledge graphs such as Freebase, how to model paths of arbitrary lengths to solve some knowledge graph reasoning problems? These research problems are evaluated on word/phrase similarity, paraphrase identification, question answering, and KG reasoning tasks, among others.

    • January 25, 2017

      Hal Daume

      Machine learning-based natural language processing systems are amazingly effective, when plentiful labeled training data exists for the task/domain of interest. Unfortunately, for broad coverage (both in task and domain) language understanding, we're unlikely to ever have sufficient labeled data, and systems must find some other way to learn. I'll describe a novel algorithm for learning from interactions, and several problems of interest, most notably machine simultaneous interpretation (translation while someone is still speaking). This is all joint work with some amazing (former) students He He, Alvin Grissom II, John Morgan, Mohit Iyyer, Sudha Rao and Leonardo Claudino, as well as colleagues Jordan Boyd-Graber, Kai-Wei Chang, John Langford, Akshay Krishnamurthy, Alekh Agarwal, Stéphane Ross, Alina Beygelzimer and Paul Mineiro.

    • January 18, 2017

      Zhou Yu

      Communication is an intricate dance, an ensemble of coordinated individual actions. Imagine a future where machines interact with us like humans, waking us up in the morning, navigating us to work, or discussing our daily schedules in a coordinated and natural manner. Current interactive systems being developed by Apple, Google, Microsoft, and Amazon attempt to reach this goal by combining a large set of single-task systems. But products like Siri, Google Now, Cortana and Echo still follow pre-specified agendas; they cannot transition between tasks smoothly, nor can they track and adapt to different users naturally. My research draws on recent developments in speech and natural language processing, human-computer interaction, and machine learning to work towards the goal of developing situated intelligent interactive systems. These systems can coordinate with users to achieve effective and natural interactions. I have successfully applied the proposed concepts to various tasks, such as social conversation, job interview training and movie promotion. My team's proposal on engaging social conversation systems was selected to receive $100,000 from Amazon Inc. to compete in the Amazon Alexa Prize Challenge.
