Viewing 81-100 of 156 videos See AI2’s full collection of videos on our YouTube channel.
    • February 13, 2017

      Wenpeng Yin

      Wenpeng's talk mainly covers his work developing state-of-the-art deep neural networks to learn representations for different granularity of language units including single words, phrases, sentences, documents and knowledge graphs (KG). Specifically, he tries to deal with these questions: (a) So many pre-trained word embeddings, is there an upper bound? What is the cheapest way to get higher-quality word embeddings? -- More training data? More advanced algorithm/objective function? (b) How to learn representations for phrases which appear continuous as well as discontinuous? How to derive representations for phrases of arbitrary lengths? (c) How to learn sentence representations in supervised, in unsupervised or in context constraints? (d) Given a question, how to distill the document so that its representation is specific to the question? (e) In knowledge graphs such as Freebase, how to model the paths of arbitrary lengths to solve some knowledge graph reasoning problems. These research problems are evaluated on word/phrase similarity, paraphrase identification, question answering, KG reasoning tasks etc.

      Less More
    • January 25, 2017

      Hal Daume

      Machine learning-based natural language processing systems are amazingly effective, when plentiful labeled training data exists for the task/domain of interest. Unfortunately, for broad coverage (both in task and domain) language understanding, we're unlikely to ever have sufficient labeled data, and systems must find some other way to learn. I'll describe a novel algorithm for learning from interactions, and several problems of interest, most notably machine simultaneous interpretation (translation while someone is still speaking). This is all joint work with some amazing (former) students He He, Alvin Grissom II, John Morgan, Mohit Iyyer, Sudha Rao and Leonardo Claudino, as well as colleagues Jordan Boyd-Graber, Kai-Wei Chang, John Langford, Akshay Krishnamurthy, Alekh Agarwal, Stéphane Ross, Alina Beygelzimer and Paul Mineiro.

      Less More
    • January 18, 2017

      Zhou Yu

      Communication is an intricate dance, an ensemble of coordinated individual actions. Imagine a future where machines interact with us like humans, waking us up in the morning, navigating us to work, or discussing our daily schedules in a coordinated and natural manner. Current interactive systems being developed by Apple, Google, Microsoft, and Amazon attempt to reach this goal by combining a large set of single-task systems. But products like Siri, Google Now, Cortana and Echo still follow pre-specified agendas that cannot transition between tasks smoothly and track and adapt to different users naturally. My research draws on recent developments in speech and natural language processing, human-computer interaction, and machine learning to work towards the goal of developing situated intelligent interactive systems. These systems can coordinate with users to achieve effective and natural interactions. I have successfully applied the proposed concepts to various tasks, such as social conversation, job interview training and movie promotion. My team's proposal on engaging social conversation systems was selected to receive $100,000 from Amazon Inc. to compete in the Amazon Alexa Prize Challenge.

      Less More
    • November 19, 2016

      Oren Etzioni

      Artificial Intelligence advocate Oren Etzioni makes a case for the life-saving benefits of AI used wisely to improve our way of life. Acknowledging growing fears about AI’s potential for abuse of power, he asks us to consider how to responsibly balance our desire for greater intelligence and autonomy with the risks inherent in this new and growing technology. Less

      Less More
    • November 8, 2016

      Manohar Pulari

      Over the past 5 years the community has made significant strides in the field of Computer Vision. Thanks to large scale datasets, specialized computing in form of GPUs and many breakthroughs in modeling better convnet architectures Computer Vision systems in the wild at scale are becoming a reality. At Facebook AI Research we want to embark on the journey of making breakthroughs in the field of AI and using them for the benefit of connecting people and helping remove barriers for communication. In that regard Computer Vision plays a significant role as the media content coming to Facebook is ever increasing and building models that understand this content is crucial in achieving our mission of connecting everyone. In this talk I will gloss over how we think about problems related to Computer Vision at Facebook and touch various aspects related to supervised, semi-supervised, unsupervised learning. I will jump between various research efforts involving representation learning. I will also highlight some large scale applications and talk about limitations of current systems and how we are planning to tackle them. Less

      Less More
    • October 18, 2016

      Kun Xu

      As very large structured knowledge bases have become available, answering natural language questions over structured knowledge facts has attracted increasing research efforts. We tackle this task in a pipeline paradigm, that is, recognizing users’ query intention and mapping the involved semantic items against a given knowledge base (KB). we propose an efficient pipeline framework to model a user’s query intention as a phrase level dependency DAG which is then instantiated regarding a specific KB to construct the final structured query. Our model benefits from the efficiency of structured prediction models and the separation of KB-independent and KB-related modelings. The most challenging problem in the structure instantiation is to ground the relational phrases to KB predicates which essentially can be treated as a relation classification (RE) task. To learn a robust and generalized representation of the relation, we propose a multi-channel convolutional neural network which works on the shortest dependency path. Furthermore, we introduce a negative sampling strategy to learn the assignment of subjects and objects of a relation. Less

      Less More
    • October 18, 2016

      Jacob Andreas

      Language understanding depends on two abilities: an ability to translate between natural language utterances and abstract representations of meaning, and an ability to relate these meaning representations to the world. In the natural language processing literature, these tasks are respectively known as "semantic parsing" and "grounding", and have been treated as essentially independent problems. In this talk, I will present two modular neural architectures for jointly learning to ground language in the world and reason about it compositionally. I will first describe a technique that uses syntactic information to dynamically construct neural networks from composable primitives. The resulting structures, called "neural module networks", can be used to achieve state-of-the-art results on a variety of grounded question answering tasks. Next, I will present a model for contextual referring expression generation, in which contrastive behavior results from a combination of learned semantics and inference-driven pragmatics. This model is again backed by modular neural components---in this case elementary listener and speaker representations. It is able to successfully complete a challenging referring expression generation task, exhibiting pragmatic behavior without ever observing such behavior at training time.

      Less More
    • September 29, 2016

      Karthik Narasimhan

      In this talk, I will describe two approaches to learning natural language semantics using reward-based feedback. This is in contrast to many NLP approaches that rely on large amounts of supervision, which is often expensive and difficult to obtain. First, I will describe a framework utilizing reinforcement learning to improve information extraction (IE). Our approach identifies alternative sources of information by querying the web, extracting from new sources, and reconciling the extracted values until sufficient evidence is collected. Our experiments on two datasets -- shooting incidents and food adulteration cases -- demonstrate that our system significantly outperforms traditional extractors and a competitive meta-classifier baseline. Second, I will talk about learning control policies for text-based games where an agent needs to understand natural language to operate effectively in a virtual environment. We employ a deep reinforcement learning framework to jointly learn state representations and action policies using game rewards as feedback, capturing semantics of the game states in the process.

      Less More
    • September 26, 2016

      Shobeir Fakhraei

      Our world is becoming increasingly connected, and so is the data collected from it. To represent, reason about, and model the real-world data, it is essential to develop computational models capable of representing the underlying network structures and their characteristics. Domains such as scholarly networks, biology, online social networks, the World Wide Web and information networks, and recommender systems are just a few examples that include explicit or implicit network structures. I have studied and developed computational models for representing and reasoning about rich, heterogeneous, and interlinked data that span over feature-based and embedding-based approaches to statistical relational methods that more explicitly model dependencies between interconnected entities. In this talk, I will discuss different methods of modeling node classification and link inference on networks in several domains, and highlight two important aspects: (1) Heterogeneous entities and multi-relational structures, (2) joint inference and collective classification of the unlabeled data. I will also introduce our model for link inference that serves as a template to encode a variety of information such as structural, biological, social, contextual interactions in different domains.

      Less More
    • September 19, 2016

      Anna Rohrbach

      In recent years many challenging problems have emerged in the field of language and vision. Frequently the only form of available annotation is the natural language sentence associated with an image or video. How can we address complex tasks like automatic video description or visual grounding of textual phrases with these weak and noisy annotations? In my talk I will first present our pioneering work on automatic movie description. We collected a large scale dataset and proposed an approach to learn visual semantic concepts from weak sentence annotations. I will then talk about our approach to grounding arbitrary language phrases in images. It is able to operate in un- and semi-supervised settings (with respect to the localization annotations) by learning to reconstruct the input phrase.

      Less More
    • September 13, 2016

      Ajay Nagesh

      Information Extraction has become an indispensable tool in our quest to handle the data deluge of the information age. In this talk, we discuss the categorization of complex relational features and outline methods to learn feature combinations through induction. We demonstrate the efficacy of induction techniques in learning rules for the identification of named entities in text – the novelty being the application of induction techniques to learn in a very expressive declarative rule language. Next, we discuss our investigations in the paradigm of distant supervision, which facilitates the creation of large albeit noisy training data. We devise an inference framework in which constraints can be easily specified in learning relation extractors. We reformulate the learning objective in a max-margin framework. To the best of our knowledge, our formulation is the first to optimize multi-variate non-linear performance measures such as F1 for a latent variable structure prediction task. Towards the end, we will briefly touch upon some recent exploratory work to leverage matrix completion methods and novel embedding techniques for predicting a richer fine-grained set of entity types to help in downstream applications such as Relation Extraction and Question Answering.

      Less More
    • September 7, 2016

      Siva Reddy

      I will present three semantic parsing approaches for querying Freebase in natural language 1) training only on raw web corpus, 2) training on question-answer (QA) pairs, and 3) training on both QA pairs and web corpus. For 1 and 2, we conceptualise semantic parsing as a graph matching problem, where natural language graphs built using CCG/dependency logical forms are transduced to Freebase graphs. For 3, I will present a natural-logic approach for Semantic Parsing. Our methods achieve state-of-the-art on WebQuestions and Free917 QA datasets.

      Less More
    • August 23, 2016

      Matthew Peters

      Distributed representations of words, phrases and sentences are central to recent advances in machine translation, language modeling, semantic similarity, and other tasks. In this talk, I'll explore ways to learn similar representations of search queries, web pages and web sites. The first portion of the talk describes a method to learn a keyword-web page similarity function applicable to web search. It represents the web page as a set of attributes (URL, title, meta description tag, etc) and uses a separate LSTM encoder for each attribute. The network is trained end-to-end from clickthrough logs. The second half of the talk introduces a measure of authority for each web page and jointly learns keyword-keyword, keyword-site and keyword-site-authority relationships. The multitask network leverages a shared representation for keywords and sites and learns a fine grained topic authority (for example is an authority on the topic "Bernie Sanders" but not on "Seattle Mariners").

      Less More
    • August 22, 2016

      Jay Pujara

      Automated question answering, knowledgeable digital assistants, and grappling with the massive data flooding the Web all depend on structured knowledge. Precise knowledge graphs capturing the many, complex relationships between entities are the missing piece for many problems, but knowledge graph construction is notoriously difficult. In this talk, I will chronicle common failures from the first generation of information extraction systems and show how combining statistical NLP signals and semantic constraints addresses these problems. My method, Knowledge Graph Identification (KGI), exploits the key lessons of the statistical relational learning community and uses them for better knowledge graph construction. Probabilistic models are often discounted due to scalability concerns, but KGI translates the problem into a tractable convex objective that is amenable to parallelization. Furthermore, the inferences from KGI have provable optimality and can be updated efficiently using approximate techniques that have bounded regret. I demonstrate state-of-the-art performance of my approach on knowledge graph construction and entity resolution tasks on NELL and Freebase, and discuss exciting new directions for KG construction.

      Less More
    • August 1, 2016

      Dan Garrette

      Learning NLP models from weak forms of supervision has become increasingly important as the field moves toward applications in new languages and domains. Much of the existing work in this area has focused on designing learning approaches that are able to make use of small amounts of human-generated data. In this talk, I will present work on a complementary form of inductive bias: universal, cross-lingual principles of how grammars function. I will develop these intuitions with a series of increasingly complex models based in the Combinatory Categorial Grammar (CCG) formalism: first, a supertagging model that biases towards associative adjacent-category relationships; second, a parsing model that biases toward simpler grammatical analyses; and finally, a novel parsing model, with accompanying learning procedure, that is able to exploit both of these biases by parameterizing the relationships between each constituent label and its supertag context to find trees with a better global coherence. We model grammar with CCG because the structured, logic-backed nature of CCG categories and the use of a small universal set of constituent combination rules are ideally suited to encoding as priors, and we train our models within a Bayesian setting that combines these prior beliefs about how natural languages function with the empirical statistics gleaned from large amounts of raw text. Experiments with each of these models show that when training from only partial type-level supervision and a corpus of unannotated text, employing these universal properties as soft constraints yields empirically better models. Additional gains are obtained by further shaping the priors with corpus-specific information that is estimated automatically from the tag dictionary and raw text.

      Less More
    • July 18, 2016

      Claudio Delli Bovi

      The Open Information Extraction (OIE) paradigm has received much attention in the NLP community over the last decade. Since the earliest days, most OIE approaches have been focusing on Web-scale corpora, which raises issues such as massive amounts of noise. Also, OIE systems can be very different in nature and develop their own type inventories, with no portable ontological structure. This talk steps back and explores both issues by presenting two substantially different approaches to the task: in the first we shift the target of a full-fledged OIE pipeline to a relatively small, dense corpus of definitional knowledge; in the second we try to make sense of different OIE outputs by merging them into a single, unified and fully disambiguated knowledge repository.

      Less More
    • July 15, 2016

      Yuxin Chen

      Sequential information gathering, i.e., selectively acquiring the most useful data, plays a key role in interactive machine learning systems. Such problem has been studied in the context of Bayesian active learning and experimental design, decision making, optimal control and numerous other domains. In this talk, we focus on a class of information gathering tasks, where the goal is to learn the value of some unknown target variable through a sequence of informative, possibly noisy tests. In contrast to prior work, we focus on the challenging, yet practically relevant setting where test outcomes can be conditionally dependent given the hidden target variable. Under such assumptions, common heuristics, such as greedily performing tests that maximize the reduction in uncertainty of the target, often perform poorly. We propose a class of novel, computationally efficient active learning algorithms, and prove strong theoretical guarantees that hold with correlated, possibly noisy tests. Rather than myopically optimize the value of a test (which, in our case, is the expected reduction in prediction error), at each step, our algorithms pick the test that maximizes the gain in a surrogate objective, which is adaptive submodular. This property enables us to utilize an efficient greedy optimization while providing strong approximation guarantees. We demonstrate our algorithms in several real-world problem instances, including a touch-based location task on an actual robotic platform, and an active preference learning task via pairwise comparisons.

      Less More
    • June 21, 2016

      Katrin Erk

      As the field of Natural Language Processing develops, more ambitious semantic tasks are being addressed, such as Question Answering (QA) and Recognizing Textual Entailment (RTE). Solving these tasks requires (ideally) an in-depth representation of sentence structure as well as expressive and flexible representations at the word level. We have been exploring a combination of logical form with distributional as well as resource-based information at the word level, using Markov Logic Networks (MLNs) to perform probabilistic inference over the resulting representations. In this talk, I will focus on the three main components of a system we have developed for the task of Textual Entailment: (1) Logical representation for processing in MLNs, (2) lexical entailment rule construction by integrating distributional information with existing resources, and (3) probabilistic inference, the problem of solving the resulting MLN inference problems efficiently. I will also comment on how I think the ideas from this system can be adapted to Question Answering and the more general task of in-depth single-document understanding.

      Less More
    • June 20, 2016

      Marcus Rohrbach

      Language is the most important channel for humans to communicate about what they see. To allow an intelligent system to effectively communicate with humans it is thus important to enable it to relate information in words and sentences with the visual world. For this a system should be compositional, so it is e.g. not surprised when it encounters a novel object and can still talk about it. It should be able to explain in natural language, why it recognized a given object in an image as certain class, to allow a human to trust and understand it. However, it should not only be able to generate natural language, but also understand it, and locate sentences and linguistic references in the visual world. In my talk, I will discuss how we approach these different fronts by looking at the tasks of language generation about images, visual grounding, and visual question answering. I will conclude with a discussion of the challenges ahead.

      Less More
    • June 13, 2016

      Megasthenis Asteris

      Principal component analysis (PCA) is one of the most popular tools for identifying structure and extracting interpretable information from datasets. In this talk, I will discuss constrained variants of PCA such as Sparse or Nonnegative PCA that are computationally harder, but offer higher data interpretability. I will describe a framework for solving quadratic optimization problems --such as PCA-- under sparsity or other combinatorial constraints. Our method can surprisingly solve such problems exactly when the involved quadratic form matrix is positive semidefinite and low rank. Of course, real datasets are not low-rank, but they can frequently be well approximated by low-rank matrices. For several datasets, we obtain excellent empirical performance and provable upper bounds that guarantee that our objective is close to the unknown optimum.

      Less More