See AI2’s full collection of videos on our YouTube channel.
    • May 19, 2017

      Scott Yih

      Building a question answering system to automatically answer natural-language questions is a long-standing research problem. While unstructured text collections have traditionally been the main information source for answering questions, the development of large-scale knowledge bases provides new opportunities for open-domain factoid question answering. In this talk, I will present our recent work on semantic parsing, which maps natural language questions to structured queries that can be executed on a graph knowledge base to answer the questions. Our approach defines a query graph that resembles subgraphs of the knowledge base and can be directly mapped to a logical form. With this design, semantic parsing is reduced to query graph generation, formulated as a staged search problem. Compared to existing methods, our solution is conceptually simple, yet it substantially outperforms previous state-of-the-art results.
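      To make the staged search concrete, below is a minimal Python sketch of query graph generation; the entity linker, candidate generators, and scorer are hypothetical placeholders standing in for the learned components described in the talk.

```python
# Hedged sketch of staged query-graph generation for KB question answering.
# link_entities, candidate_relations, candidate_constraints, and score are
# invented placeholders; the real system uses learned components.
from dataclasses import dataclass

@dataclass
class QueryGraph:
    topic_entity: str = ""
    relation_path: tuple = ()   # core inferential chain from the topic entity
    constraints: tuple = ()     # extra conditions attached to the graph
    score: float = 0.0

def link_entities(question):
    return ["m.obama"]          # stage 1: topic entity linking (placeholder)

def candidate_relations(entity):
    return [("people.person.spouse",), ("people.person.children",)]

def candidate_constraints(graph):
    return [("from", "1990")]

def score(question, graph):
    # Placeholder: prefer simpler graphs; real systems learn a semantic match.
    return -len(graph.relation_path) - len(graph.constraints)

def staged_search(question, beam=3):
    # Stage 2: extend each linked entity with a candidate relation path.
    graphs = [QueryGraph(e, path)
              for e in link_entities(question)
              for path in candidate_relations(e)]
    # Stage 3: optionally attach constraints to each partial graph.
    graphs += [QueryGraph(g.topic_entity, g.relation_path, (c,))
               for g in graphs for c in candidate_constraints(g)]
    for g in graphs:
        g.score = score(question, g)
    return sorted(graphs, key=lambda g: g.score, reverse=True)[:beam]

print(staged_search("who is obama's wife?"))
```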

    • May 9, 2017

      Luheng He

      Semantic role labeling (SRL) systems aim to recover the predicate-argument structure of a sentence, essentially determining “who did what to whom”, “when”, and “where”. We introduce a new deep learning model for SRL that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations. We use a deep highway BiLSTM architecture with constrained decoding, while observing a number of recent best practices for initialization and regularization. Our 8-layer ensemble model achieves 83.2 F1 on the CoNLL 2005 test set and 83.4 F1 on CoNLL 2012, roughly a 10% relative error reduction over the previous state of the art. Extensive empirical analysis of these gains shows that (1) deep models excel at recovering long-distance dependencies but can still make surprisingly obvious errors, and (2) there is still room for syntactic parsers to improve these results. These findings suggest directions for future improvements on SRL performance.
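      As a toy illustration of the core ingredients (not the authors' code), here is a PyTorch sketch of a deep BiLSTM tagger with highway-style gates between layers; dimensions are arbitrary, and the constrained decoding step is omitted.

```python
# Hedged sketch: deep BiLSTM tagger with highway connections between layers.
import torch
import torch.nn as nn

class HighwayBiLSTMTagger(nn.Module):
    def __init__(self, vocab, dim, n_layers, n_tags):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.layers = nn.ModuleList(
            [nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)
             for _ in range(n_layers)])
        self.gates = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(n_layers)])
        self.out = nn.Linear(dim, n_tags)

    def forward(self, tokens):
        h = self.embed(tokens)
        for lstm, gate in zip(self.layers, self.gates):
            out, _ = lstm(h)
            t = torch.sigmoid(gate(h))      # highway transform gate
            h = t * out + (1 - t) * h       # mix new output with carry-over
        return self.out(h)                  # per-token tag scores

tagger = HighwayBiLSTMTagger(vocab=1000, dim=64, n_layers=8, n_tags=10)
scores = tagger(torch.randint(0, 1000, (2, 12)))   # (batch, seq, n_tags)
```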

    • May 8, 2017

      Derry Wijaya

      One way we can formulate natural language understanding is as the task of mapping natural language text to its meaning representation: entities and relations anchored to the world. Since verbs express relations over their arguments and adjuncts, a lexical resource about verbs can facilitate natural language understanding by mapping verbs to relations over the entities expressed by their arguments and adjuncts in the world. In my thesis work, I semi-automatically construct a large-scale verb resource called VerbKB that contains some of these mappings for natural language understanding. A verb lexical unit in VerbKB consists of a verb lexeme or a verb lexeme plus a preposition (e.g., “live”, “live in”), typed with a pair of NELL knowledge base semantic categories that indicate its subject type and its object type, e.g., “live in”(person, location). In this talk, I will present the algorithms behind VerbKB, which will complement existing resources of verbs such as WordNet and VerbNet and existing knowledge bases about entities such as NELL. VerbKB contains two types of mappings: (1) mappings from verb lexical units to binary relations in knowledge bases (e.g., from the verb lexical unit “die at”(person, nonNegInteger) to the binary relation personDiedAtAge) and (2) mappings from verb lexical units to changes in binary relations in knowledge bases (e.g., from the verb lexical unit “divorce”(person, person) to the termination of the relation hasSpouse). I will present algorithms for these two mappings and how we extend VerbKB to cover relations beyond the existing relations in the NELL knowledge base. In the spirit of building multilingual lexical resources for NLP, I will also briefly discuss my recent work on building lexical translations for high-resource and low-resource languages from monolingual or comparable corpora.
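      The two mapping types are easy to picture as lookup tables. The following sketch uses the examples from the abstract plus a couple of invented entries (personLivesInLocation and the “marry” mapping) purely for illustration.

```python
# Illustrative sketch of the two kinds of VerbKB mappings described above.
# Entries marked "assumed" are my own invented examples, not from VerbKB.

# (1) typed verb lexical unit -> KB binary relation
verb_to_relation = {
    ("die at", ("person", "nonNegInteger")): "personDiedAtAge",
    ("live in", ("person", "location")): "personLivesInLocation",  # assumed
}

# (2) typed verb lexical unit -> change in a KB binary relation
verb_to_relation_change = {
    ("divorce", ("person", "person")): ("terminate", "hasSpouse"),
    ("marry", ("person", "person")): ("initiate", "hasSpouse"),     # assumed
}

def interpret(verb, subj_type, obj_type):
    key = (verb, (subj_type, obj_type))
    if key in verb_to_relation:
        return ("assert", verb_to_relation[key])
    return verb_to_relation_change.get(key)

print(interpret("divorce", "person", "person"))  # ('terminate', 'hasSpouse')
```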

    • May 2, 2017

      Mark Yatskar

      In this talk, we examine the role of language in enabling grounded intelligence. We consider two applications where language can be used as a scaffold for (a) quickly acquiring large-scale common-sense knowledge, and (b) enabling broad-coverage recognition of events in images. We present some of the technical challenges of using language-based representations for grounding, such as sparsity, and finally present some social challenges, such as amplified gender bias in models trained on language grounding datasets.

    • April 19, 2017

      Mohit Iyyer

      Creative language—the sort found in novels, film, and comics—contains a wide range of linguistic phenomena, from phrasal and sentential syntactic complexity to high-level discourse structures such as narrative and character arcs. In this talk, I explore how we can use deep learning to understand, generate, and answer questions about creative language. I begin by presenting deep neural network models for two tasks involving creative language understanding: 1) modeling dynamic relationships between fictional characters in novels, for which our models achieve higher interpretability and accuracy than existing work; and 2) predicting dialogue and artwork from comic book panels, in which we demonstrate that even state-of-the-art deep models struggle on problems that require commonsense reasoning. Next, I introduce deep models that outperform all but the best human players on quiz bowl, a trivia game that contains many questions about creative language. Shifting to ongoing work, I describe a neural language generation method that disentangles the content of a novel (i.e., the information or story it conveys) from the style in which it is written. Finally, I conclude by integrating my work on deep learning, creative language, and question answering into a future research plan to build conversational agents that are both engaging and useful.

    • April 18, 2017

      Marti Hearst

      AI2 researchers are making groundbreaking advances in machine interpretation of scientific and educational text and images. In our current research, we are interested in improving educational technology, especially automated and semi-automated guidance systems. In past work, we have been successful in leveraging existing metadata and ontologies to produce highly usable search interfaces, and so in one very new line of work, we are investigating whether we can automatically create good practice questions from a preexisting biology ontology. In the first half of this talk, I will describe this very new work, as well as some as-yet-unexplored goals for future work in this space. AI2 researchers are also producing the world’s best citation search system. In the second half of this talk, I will describe some prior NLP and HCI work on analyzing bioscience citation text that might be of interest to the Semantic Scholar team as well as the NLP teams.

    • February 20, 2017

He He

      The future of virtual assistants, self-driving cars, and smart homes requires intelligent agents that work intimately with users. Instead of passively following orders given by users, an interactive agent must actively collaborate with people through communication, coordination, and user adaptation. In this talk, I will present our recent work towards building agents that interact with humans. First, we propose a symmetric collaborative dialogue setting in which two agents, each with some private knowledge, must communicate in natural language to achieve a common goal. We present a human-human dialogue dataset that poses new challenges to existing models, and propose a neural model with dynamic knowledge graph embeddings. Second, we study the user-adaptation problem in quiz bowl, a competitive, incremental question-answering game. We show that explicitly modeling different human behaviors leads to more effective policies that exploit sub-optimal players. I will conclude by discussing opportunities and open questions in learning interactive agents.

    • February 16, 2017

      Christopher Lin

      Research in artificial intelligence and machine learning (ML) has exploded in the last decade, bringing humanity to the cusp of self-driving cars, digital personal assistants, and unbeatable game-playing robots. My research, which spans the areas of AI, ML, crowdsourcing, and natural language processing (NLP), focuses on an area where machines are still significantly inferior to humans, despite their super-human intelligence in so many other facets of life: the intelligent management of machine learning (iML), or the ability of machines to reason about what they don’t know so that they may independently and efficiently close gaps in their knowledge. iML encompasses many important questions surrounding the ML pipeline, including, but not limited to: 1) How can an agent optimally obtain high-quality labels? 2) How can an agent that is trying to learn a new concept sift through all the unlabeled examples that exist in the world to identify exemplary subsets that would make good training and test sets? An agent must be able to identify examples that are positive for that concept; learning is extremely expensive, if not impossible, if one cannot find representative examples. 3) Given a fixed budget, should an agent try to obtain a large but noisy training set, or a small but clean one? How can an agent achieve more cost-effective learning by carefully considering this tradeoff? In this talk, I will go into depth on the third question. I will first discuss properties of learning problems that affect this tradeoff. Then I will introduce re-active learning, a generalization of active learning that allows for the relabeling of existing examples, and show why traditional active learning algorithms don’t work well for re-active learning. Finally, I will introduce new algorithms for re-active learning and show that they perform well on several domains.
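      A quick simulation conveys the tradeoff at the heart of the third question. This sketch (noise model and numbers invented for illustration) spends a fixed annotation budget either on many singly-labeled examples or on fewer examples whose labels are cleaned by majority-voted relabeling.

```python
# Hedged sketch of the label-new vs. relabel tradeoff behind re-active
# learning: a fixed budget buys either a bigger, noisier training set or a
# smaller, cleaner one. Annotator accuracy of 0.7 is an illustrative guess.
import random

def noisy_label(true_label, accuracy=0.7):
    return true_label if random.random() < accuracy else 1 - true_label

def majority(labels):
    return int(sum(labels) > len(labels) / 2)

def simulate(budget, relabel_k):
    # Spend the budget in blocks of relabel_k annotations per example.
    n_examples = budget // relabel_k
    correct = 0
    for _ in range(n_examples):
        true = random.randint(0, 1)
        votes = [noisy_label(true) for _ in range(relabel_k)]
        correct += majority(votes) == true
    return n_examples, correct / max(n_examples, 1)

random.seed(0)
for k in (1, 3, 5):   # 1 = always label new data, >1 = relabel existing data
    n, acc = simulate(budget=300, relabel_k=k)
    print(f"{k} labels/example -> {n} examples at ~{acc:.2f} label accuracy")
```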

    • February 13, 2017

      Wenpeng Yin

      Wenpeng's talk mainly covers his work developing state-of-the-art deep neural networks to learn representations for different granularities of language units, including single words, phrases, sentences, documents, and knowledge graphs (KGs). Specifically, he tries to address these questions: (a) Given so many pre-trained word embeddings, is there an upper bound? What is the cheapest way to get higher-quality word embeddings -- more training data, or more advanced algorithms and objective functions? (b) How can we learn representations for phrases that appear contiguously as well as discontiguously? How can we derive representations for phrases of arbitrary length? (c) How can we learn sentence representations in supervised, unsupervised, or context-constrained settings? (d) Given a question, how can we distill a document so that its representation is specific to the question? (e) In knowledge graphs such as Freebase, how can we model paths of arbitrary length to solve knowledge graph reasoning problems? These research problems are evaluated on word/phrase similarity, paraphrase identification, question answering, and KG reasoning tasks.

    • January 25, 2017

Hal Daumé III

      Machine learning-based natural language processing systems are amazingly effective, when plentiful labeled training data exists for the task/domain of interest. Unfortunately, for broad coverage (both in task and domain) language understanding, we're unlikely to ever have sufficient labeled data, and systems must find some other way to learn. I'll describe a novel algorithm for learning from interactions, and several problems of interest, most notably machine simultaneous interpretation (translation while someone is still speaking). This is all joint work with some amazing (former) students He He, Alvin Grissom II, John Morgan, Mohit Iyyer, Sudha Rao and Leonardo Claudino, as well as colleagues Jordan Boyd-Graber, Kai-Wei Chang, John Langford, Akshay Krishnamurthy, Alekh Agarwal, Stéphane Ross, Alina Beygelzimer and Paul Mineiro.

    • January 18, 2017

      Zhou Yu

      Communication is an intricate dance, an ensemble of coordinated individual actions. Imagine a future where machines interact with us like humans, waking us up in the morning, navigating us to work, or discussing our daily schedules in a coordinated and natural manner. Current interactive systems being developed by Apple, Google, Microsoft, and Amazon attempt to reach this goal by combining a large set of single-task systems. But products like Siri, Google Now, Cortana, and Echo still follow pre-specified agendas: they cannot transition between tasks smoothly or track and adapt to different users naturally. My research draws on recent developments in speech and natural language processing, human-computer interaction, and machine learning to work towards the goal of developing situated intelligent interactive systems. These systems can coordinate with users to achieve effective and natural interactions. I have successfully applied the proposed concepts to various tasks, such as social conversation, job interview training, and movie promotion. My team’s proposal on engaging social conversation systems was selected to receive $100,000 from Amazon to compete in the Amazon Alexa Prize Challenge.

    • November 19, 2016

      Oren Etzioni

      Artificial Intelligence advocate Oren Etzioni makes a case for the life-saving benefits of AI used wisely to improve our way of life. Acknowledging growing fears about AI’s potential for abuse of power, he asks us to consider how to responsibly balance our desire for greater intelligence and autonomy with the risks inherent in this new and growing technology.

    • November 8, 2016

Manohar Paluri

      Over the past five years, the community has made significant strides in the field of computer vision. Thanks to large-scale datasets, specialized computing in the form of GPUs, and many breakthroughs in modeling better convnet architectures, computer vision systems that work in the wild and at scale are becoming a reality. At Facebook AI Research, we want to embark on the journey of making breakthroughs in the field of AI and using them for the benefit of connecting people and helping remove barriers to communication. In that regard, computer vision plays a significant role, as the media content coming to Facebook is ever increasing, and building models that understand this content is crucial to achieving our mission of connecting everyone. In this talk, I will give an overview of how we think about computer vision problems at Facebook and touch on various aspects of supervised, semi-supervised, and unsupervised learning. I will move between various research efforts involving representation learning. I will also highlight some large-scale applications, and talk about the limitations of current systems and how we plan to tackle them.

    • October 18, 2016

      Kun Xu

      As very large structured knowledge bases have become available, answering natural language questions over structured knowledge facts has attracted increasing research effort. We tackle this task in a pipeline paradigm: recognizing users’ query intention and mapping the involved semantic items against a given knowledge base (KB). We propose an efficient pipeline framework that models a user’s query intention as a phrase-level dependency DAG, which is then instantiated against a specific KB to construct the final structured query. Our model benefits from the efficiency of structured prediction models and the separation of KB-independent and KB-related modeling. The most challenging problem in structure instantiation is grounding the relational phrases to KB predicates, which can essentially be treated as a relation extraction (RE) task. To learn a robust and generalized representation of the relation, we propose a multi-channel convolutional neural network that works on the shortest dependency path. Furthermore, we introduce a negative sampling strategy to learn the assignment of subjects and objects of a relation.
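      The following PyTorch sketch is a simplified reconstruction (my assumptions, not the talk's code) of a multi-channel CNN over a shortest dependency path: word and dependency-label channels are embedded, convolved, and max-pooled into a path representation that is scored against KB predicates.

```python
# Hedged sketch: two embedding channels (words, dependency labels) over the
# shortest dependency path, convolved and max-pooled, then classified.
import torch
import torch.nn as nn

class PathCNN(nn.Module):
    def __init__(self, n_words, n_deps, dim, n_predicates):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, dim)
        self.dep_emb = nn.Embedding(n_deps, dim)
        self.conv = nn.Conv1d(2 * dim, 128, kernel_size=3, padding=1)
        self.out = nn.Linear(128, n_predicates)

    def forward(self, word_ids, dep_ids):
        # Concatenate the two channels along the feature dimension.
        x = torch.cat([self.word_emb(word_ids), self.dep_emb(dep_ids)], dim=-1)
        h = torch.relu(self.conv(x.transpose(1, 2)))  # (batch, 128, path_len)
        h = h.max(dim=2).values                       # max-pool over the path
        return self.out(h)                            # KB predicate scores

model = PathCNN(n_words=5000, n_deps=50, dim=64, n_predicates=200)
scores = model(torch.randint(0, 5000, (2, 7)), torch.randint(0, 50, (2, 7)))
```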

    • October 18, 2016

      Jacob Andreas

      Language understanding depends on two abilities: an ability to translate between natural language utterances and abstract representations of meaning, and an ability to relate these meaning representations to the world. In the natural language processing literature, these tasks are respectively known as "semantic parsing" and "grounding", and have been treated as essentially independent problems. In this talk, I will present two modular neural architectures for jointly learning to ground language in the world and reason about it compositionally. I will first describe a technique that uses syntactic information to dynamically construct neural networks from composable primitives. The resulting structures, called "neural module networks", can be used to achieve state-of-the-art results on a variety of grounded question answering tasks. Next, I will present a model for contextual referring expression generation, in which contrastive behavior results from a combination of learned semantics and inference-driven pragmatics. This model is again backed by modular neural components, in this case elementary listener and speaker representations. It is able to successfully complete a challenging referring expression generation task, exhibiting pragmatic behavior without ever observing such behavior at training time.
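      A toy sketch of the module network idea follows; the real modules are differentiable attention operators over image features, whereas these stand-ins operate on a symbolic grid purely to show how a layout derived from syntax assembles composable pieces into a network.

```python
# Toy illustration (not the actual NMN implementation) of assembling and
# running a module layout such as exists(and(find[red], find[red])).
import numpy as np

def find(world, word):
    # "Attention" over grid cells whose content matches `word`.
    return (world == word).astype(float)

def and_(a, b):
    return a * b          # intersect two attention maps

def exists(attention):
    return attention.sum() > 0

MODULES = {"find": find, "and": and_, "exists": exists}

def assemble_and_run(layout, world):
    # layout is a nested tuple, e.g. ("exists", ("find", "red")).
    op, *args = layout
    vals = [assemble_and_run(a, world) if isinstance(a, tuple) else a
            for a in args]
    if op == "find":
        return MODULES["find"](world, *vals)
    return MODULES[op](*vals)

world = np.array([["red", "blue"], ["circle", "red"]])
layout = ("exists", ("and", ("find", "red"), ("find", "red")))
print(assemble_and_run(layout, world))  # True
```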

    • September 29, 2016

      Karthik Narasimhan

      In this talk, I will describe two approaches to learning natural language semantics using reward-based feedback. This is in contrast to many NLP approaches that rely on large amounts of supervision, which is often expensive and difficult to obtain. First, I will describe a framework utilizing reinforcement learning to improve information extraction (IE). Our approach identifies alternative sources of information by querying the web, extracting from new sources, and reconciling the extracted values until sufficient evidence is collected. Our experiments on two datasets -- shooting incidents and food adulteration cases -- demonstrate that our system significantly outperforms traditional extractors and a competitive meta-classifier baseline. Second, I will talk about learning control policies for text-based games where an agent needs to understand natural language to operate effectively in a virtual environment. We employ a deep reinforcement learning framework to jointly learn state representations and action policies using game rewards as feedback, capturing semantics of the game states in the process.
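      For the text-game setting, here is a minimal tabular Q-learning loop over a toy two-step game; the actual work learns deep state representations from the game text, whereas this sketch simply keys the Q-table on raw state strings.

```python
# Hedged sketch: Q-learning from game rewards in a made-up two-room text game.
import random

GAME = {  # state -> {action: (next_state, reward)}
    "You are in a dark room. A door leads east.": {
        "go east": ("You see a treasure chest.", 0),
        "wait": ("You are in a dark room. A door leads east.", -1)},
    "You see a treasure chest.": {
        "open chest": ("You found the treasure!", 10),
        "wait": ("You see a treasure chest.", -1)},
    "You found the treasure!": {},   # terminal state
}

Q, alpha, gamma, eps = {}, 0.5, 0.9, 0.2

def q(s, a):
    return Q.get((s, a), 0.0)

random.seed(0)
for episode in range(200):
    s = "You are in a dark room. A door leads east."
    while GAME[s]:
        actions = list(GAME[s])
        # epsilon-greedy action selection
        a = random.choice(actions) if random.random() < eps \
            else max(actions, key=lambda act: q(s, act))
        s2, r = GAME[s][a]
        best_next = max((q(s2, a2) for a2 in GAME[s2]), default=0.0)
        Q[(s, a)] = q(s, a) + alpha * (r + gamma * best_next - q(s, a))
        s = s2

chest = "You see a treasure chest."
print(max(GAME[chest], key=lambda act: q(chest, act)))  # -> "open chest"
```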

    • September 26, 2016

      Shobeir Fakhraei

      Our world is becoming increasingly connected, and so is the data collected from it. To represent, reason about, and model real-world data, it is essential to develop computational models capable of representing the underlying network structures and their characteristics. Domains such as scholarly networks, biology, online social networks, the World Wide Web and information networks, and recommender systems are just a few examples that include explicit or implicit network structures. I have studied and developed computational models for representing and reasoning about rich, heterogeneous, and interlinked data, ranging from feature-based and embedding-based approaches to statistical relational methods that more explicitly model dependencies between interconnected entities. In this talk, I will discuss different methods of modeling node classification and link inference on networks in several domains, and highlight two important aspects: (1) heterogeneous entities and multi-relational structures, and (2) joint inference and collective classification of unlabeled data. I will also introduce our model for link inference that serves as a template to encode a variety of information, such as structural, biological, social, and contextual interactions in different domains.
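      As a minimal illustration of collective classification (far simpler than the statistical relational models from the talk), this sketch propagates labels from a few seed nodes to their neighbors until a fixed point; the graph and labels are made up.

```python
# Hedged sketch: iterative collective classification by neighborhood voting.
from collections import Counter

edges = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
labels = {"a": "spam", "d": "ham"}     # seed labels; b and c are unlabeled
seeds = set(labels)

for _ in range(10):                    # iterate to a fixed point
    updated = dict(labels)
    for node in edges:
        if node in seeds:              # keep seed labels fixed
            continue
        votes = Counter(labels[n] for n in edges[node] if n in labels)
        if votes:
            updated[node] = votes.most_common(1)[0][0]
    if updated == labels:
        break
    labels = updated

print(labels)  # b and c inherit labels from their labeled neighbors
```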

    • September 19, 2016

      Anna Rohrbach

      In recent years many challenging problems have emerged in the field of language and vision. Frequently the only form of available annotation is the natural language sentence associated with an image or video. How can we address complex tasks like automatic video description or visual grounding of textual phrases with these weak and noisy annotations? In my talk I will first present our pioneering work on automatic movie description. We collected a large scale dataset and proposed an approach to learn visual semantic concepts from weak sentence annotations. I will then talk about our approach to grounding arbitrary language phrases in images. It is able to operate in un- and semi-supervised settings (with respect to the localization annotations) by learning to reconstruct the input phrase.
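      A rough PyTorch sketch of the reconstruction idea follows (architecture details are my assumptions): attend over candidate image regions using a phrase encoding, then score how well the attended visual feature reconstructs the phrase; without box supervision, the attention weights themselves are the predicted grounding.

```python
# Hedged sketch of grounding-by-reconstruction over region proposals.
import torch
import torch.nn as nn

class GroundByReconstruction(nn.Module):
    def __init__(self, vocab, dim, region_dim):
        super().__init__()
        self.phrase_enc = nn.EmbeddingBag(vocab, dim)   # mean of word vectors
        self.region_proj = nn.Linear(region_dim, dim)
        self.reconstruct = nn.Linear(dim, vocab)        # bag-of-words decoder

    def forward(self, phrase_ids, regions):
        p = self.phrase_enc(phrase_ids)                 # (batch, dim)
        r = self.region_proj(regions)                   # (batch, n_regions, dim)
        attn = torch.softmax((r @ p.unsqueeze(2)).squeeze(2), dim=1)
        attended = (attn.unsqueeze(2) * r).sum(dim=1)   # weighted region feature
        return attn, self.reconstruct(attended)         # grounding + word scores

model = GroundByReconstruction(vocab=1000, dim=64, region_dim=2048)
attn, recon = model(torch.randint(0, 1000, (2, 4)), torch.randn(2, 10, 2048))
```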

    • September 13, 2016

      Ajay Nagesh

      Information Extraction has become an indispensable tool in our quest to handle the data deluge of the information age. In this talk, we discuss the categorization of complex relational features and outline methods to learn feature combinations through induction. We demonstrate the efficacy of induction techniques in learning rules for the identification of named entities in text – the novelty being the application of induction techniques to learn in a very expressive declarative rule language. Next, we discuss our investigations in the paradigm of distant supervision, which facilitates the creation of large albeit noisy training data. We devise an inference framework in which constraints can be easily specified in learning relation extractors. We reformulate the learning objective in a max-margin framework. To the best of our knowledge, our formulation is the first to optimize multi-variate non-linear performance measures such as F1 for a latent variable structure prediction task. Towards the end, we will briefly touch upon some recent exploratory work to leverage matrix completion methods and novel embedding techniques for predicting a richer fine-grained set of entity types to help in downstream applications such as Relation Extraction and Question Answering.

    • September 7, 2016

      Siva Reddy

      I will present three semantic parsing approaches for querying Freebase in natural language: 1) training only on a raw web corpus, 2) training on question-answer (QA) pairs, and 3) training on both QA pairs and a web corpus. For 1 and 2, we conceptualise semantic parsing as a graph matching problem, in which natural language graphs built from CCG/dependency logical forms are transduced to Freebase graphs. For 3, I will present a natural-logic approach to semantic parsing. Our methods achieve state-of-the-art results on the WebQuestions and Free917 QA datasets.
