Viewing 121-140 of 149 videos. See AI2’s full collection of videos on our YouTube channel.
    • March 31, 2015

      In many real-world applications of AI and machine learning, such as natural language processing, computer vision and knowledge base construction, data sources possess a natural internal structure, which can be exploited to improve predictive accuracy. Sometimes the structure can be very large, containing many interdependent inputs and outputs. Learning from data with large internal structure poses many compelling challenges, one of which is that fully-labeled examples (required for supervised learning) are difficult to acquire. This is especially true in applications like image segmentation, annotating video data, and knowledge base construction.

    • March 27, 2015

      Sonal Gupta

      Although most work in information extraction (IE) focuses on tasks that have abundant training data, in practice, many IE problems do not have any supervised training data. State-of-the-art supervised techniques like conditional random fields are impractical for such real-world applications because: (1) they require large and expensive labeled corpora; (2) they are difficult to interpret and to analyze for errors, an often-ignored but important property; and (3) they are hard to calibrate, for example to reliably return only high-precision extractions.

    • March 17, 2015

      Congle Zhang

      Most approaches to relation extraction, the task of extracting ground facts from natural language text, are based on machine learning and thus starved by scarce training data. Manual annotation is too expensive to scale to a comprehensive set of relations. Distant supervision, which automatically creates training data, only works with relations that already populate a knowledge base (KB). Unfortunately, KBs such as Freebase rarely cover event relations (e.g. “person travels to location”). Thus, the problem of extracting a wide range of events (e.g., from news streams) is an important, open challenge.

    • March 12, 2015

      Vicente Ordonez

      Recently, there has been great progress in both computer vision and natural language processing in representing and recognizing semantic units like objects, attributes, named entities, or constituents. These advances provide opportunities to create systems able to interpret and describe the visual world using natural language. This is in contrast to traditional computer vision systems, which typically output a set of disconnected labels, object locations, or annotations for every pixel in an image. The rich visually descriptive language produced by people incorporates world knowledge and human intuition that often cannot be captured by other types of annotations. In this talk, I will present several approaches that explore the connections between language, perception, and vision at three levels: learning how to name objects, generating referring expressions for objects in natural scenes, and producing general image descriptions. These methods provide a framework to augment computer vision systems with linguistic information and to take advantage of the vast amount of text associated with images on the web. I will also discuss some of the intuitions from linguistics and perception behind these efforts and how they potentially connect to the larger goal of creating visual systems that can better learn from and communicate with people.

    • March 11, 2015

      Joel Pfeiffer

      Networks provide an effective representation to model many real-world domains, with edges (e.g., friendships, citations, hyperlinks) representing relationships between items (e.g., individuals, papers, webpages). By understanding common network features, we can develop models of the distribution from which the network was likely sampled. These models can be incorporated into real-world tasks, such as modeling partially observed networks for improving relational machine learning, performing hypothesis tests for anomaly detection, or simulating algorithms on large-scale (or future) datasets. However, naively sampling networks does not scale to real-world domains; for example, drawing a single random network sample consisting of a billion users would take approximately a decade with modern hardware.

    • March 3, 2015

      Ankur Parikh

      Being able to effectively model latent structure in data is a key challenge in modern AI research, particularly in Natural Language Processing (NLP) where it is crucial to discover and leverage syntactic and semantic relationships that may not be explicitly annotated in the training set. Unfortunately, while incorporating latent variables to represent hidden structure can substantially increase representation power, the key problems of model design and learning become significantly more complicated. For example, unlike fully observed models, latent variable models can suffer from non-identifiability, making it difficult to distinguish the desired latent structure from the others. Moreover, learning is usually formulated as a non-convex optimization problem, leading to the use of local search heuristics that may become trapped in local optima.

    • February 26, 2015

      Ken Forbus

      Creating systems that can work with people, using natural modalities, as apprentices is a key step towards human-level AI. This talk will describe how my group is combining research on sketch understanding, natural language understanding, and analogical learning within the Companion cognitive architecture to create systems that can reason and learn about science by working with people. Some promising results will be described (e.g. solving conceptual physics problems involving sketches, modeling conceptual change, learning by reading) as well as work in progress (e.g. interactive knowledge capture via analogy).

    • February 5, 2015

      Bhavana Dalvi

      Semi-supervised learning (SSL) has been widely used for over a decade for various tasks, including knowledge acquisition, that lack large amounts of training data. My research proposes a novel learning scenario in which the system knows a few categories in advance, but the rest of the categories are unanticipated and need to be discovered from the unlabeled data. With the availability of enormous unlabeled datasets at low cost, and the difficulty of collecting labeled data for all possible categories, it becomes even more important to adapt traditional semi-supervised learning techniques to such realistic settings.

    • January 7, 2015

      Been Kim

      I will present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering. BCM brings the intuitive power of CBR to a Bayesian generative framework. The BCM learns prototypes, the "quintessential" observations that best represent clusters in a data set, by performing joint inference on cluster labels, prototypes and important features. Simultaneously, BCM pursues sparsity by learning subspaces, the sets of features that play important roles in the characterization of the prototypes. The prototype and subspace representation provides quantitative benefits in interpretability while preserving classification accuracy. Human subject experiments verify statistically significant improvements to participants' understanding when using explanations produced by BCM, compared to those given by prior art.

    • December 4, 2014

      Aria Haghighi

      I discuss three problems in applied natural language processing and machine learning: event discovery from distributed discourse, document content models for information extraction, and relevance engineering for a large-scale personalization engine. The first two are information extraction problems over social media which attempt to utilize richer structure and context for decision making; these sections reflect work from the tail end of my purely academic work. The relevance section will discuss work done while at my former startup Prismatic and will focus on issues arising from productionizing real-time machine learning. Along the way, I'll share my thoughts and experience around productizing research and interesting future directions.

    • December 3, 2014

      Roozbeh Mottaghi

      Scene understanding is one of the holy grails of computer vision, and despite decades of research, it is still considered an unsolved problem. In this talk, I will present a number of methods that help us take a step further towards the ultimate goal of holistic scene understanding. In particular, I will talk about our work on object detection, 3D pose estimation, and contextual reasoning, and show that modeling these tasks jointly enables better understanding of scenes. At the end of the talk, I will describe our recent work on providing richer descriptions for objects in terms of their viewpoint and sub-category information.

    • November 10, 2014

      Alan Akbik

      The use of deep syntactic information such as typed dependencies has been shown to be very effective in Information Extraction (IE). Despite this potential, the process of manually creating rule-based information extractors that operate on dependency trees is not intuitive for persons without an extensive NLP background. In this talk, I present an approach and a graphical tool that allow even novice users to quickly and easily define extraction patterns over dependency trees and directly execute them on a very large text corpus. This enables users to explore a corpus for structured information of interest in a highly interactive and data-guided fashion, and allows them to create extractors for those semantic relations they find interesting. I then present a project in which we use Information Extraction to automatically construct a very large common sense knowledge base. This knowledge base, dubbed "The Weltmodell", contains common sense facts that pertain to common noun concepts; an example of this is the concept "coffee", for which we know that it is typically drunk by a person or brought by a waiter. I show how we mine such information from very large amounts of text, how we quantify notions such as typicality and similarity, and discuss some ideas for how such world knowledge can be used to address reasoning tasks.

    • November 4, 2014

      Raymond Mooney

      Traditional logical approaches to semantics and newer distributional or vector space approaches have complementary strengths and weaknesses. We have developed methods that integrate logical and distributional models by using a CCG-based parser to produce a detailed logical form for each sentence, and combining the result with soft inference rules derived from distributional semantics that connect the meanings of their component words and phrases. For recognizing textual entailment (RTE) we use Markov Logic Networks (MLNs) to combine these representations, and for Semantic Textual Similarity (STS) we use Probabilistic Soft Logic (PSL). We present experimental results on standard benchmark datasets for these problems and emphasize the advantages of combining logical structure of sentences with statistical knowledge mined from large corpora.

    • October 1, 2014

      Chris Callison-Burch

      I will present my method for learning paraphrases, pairs of English expressions with equivalent meaning, from bilingual parallel corpora, which are more commonly used to train statistical machine translation systems. My method equates pairs of English phrases like (thrown into jail, imprisoned) when they share an aligned foreign phrase like festgenommen. Because bitexts are large and because a phrase can be aligned to many different foreign phrases, including phrases in multiple foreign languages, the method extracts a diverse set of paraphrases. For thrown into jail, we not only learn imprisoned, but also arrested, detained, incarcerated, jailed, locked up, taken into custody, and thrown into prison, along with a set of incorrect/noisy paraphrases. I'll show a number of methods for filtering out the poor paraphrases, by defining a paraphrase probability calculated from translation model probabilities, and by re-ranking the candidate paraphrases using monolingual distributional similarity measures.
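
      The pivoting idea in this abstract can be illustrated with a minimal sketch. It assumes the paraphrase probability is obtained by summing, over shared foreign phrases f, the product p(f | e1) * p(e2 | f); the abstract only says the probability is calculated from translation model probabilities, so this exact form is an assumption. The function name `paraphrase_probs`, the second pivot phrase, and all probability values are invented for illustration.

```python
from collections import defaultdict


def paraphrase_probs(phrase, p_f_given_e, p_e_given_f):
    """Score paraphrase candidates for `phrase` by pivoting through foreign phrases.

    Assumed form: p(e2 | e1) = sum over f of p(f | e1) * p(e2 | f).
    """
    scores = defaultdict(float)
    for f, p_f in p_f_given_e.get(phrase, {}).items():
        for e2, p_e2 in p_e_given_f.get(f, {}).items():
            if e2 != phrase:  # a phrase is not counted as its own paraphrase
                scores[e2] += p_f * p_e2
    return dict(scores)


# Toy translation tables; every number here is made up.
p_f_given_e = {
    "thrown into jail": {"festgenommen": 0.5, "inhaftiert": 0.5},
}
p_e_given_f = {
    "festgenommen": {"arrested": 0.5, "thrown into jail": 0.25, "imprisoned": 0.25},
    "inhaftiert": {"imprisoned": 0.5, "thrown into jail": 0.5},
}

scores = paraphrase_probs("thrown into jail", p_f_given_e, p_e_given_f)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

      In this toy setup, a candidate reachable through several pivots accumulates probability from each of them, which is one way a noisy candidate seen through a single weak alignment could end up ranked lower.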

    • August 5, 2014

      Jonathan Berant

      Machine reading calls for programs that read and understand text, but most current work only attempts to extract facts from redundant web-scale corpora. In this talk, I will focus on a new reading comprehension task that requires complex reasoning over a single document. The input is a paragraph describing a biological process, and the goal is to answer questions that require an understanding of the relations between entities and events in the process. To answer the questions, we first predict a rich structure representing the process in the paragraph. Then, we map the question to a formal query, which is executed against the predicted structure. We demonstrate that answering questions via predicted structures substantially improves accuracy over baselines that use shallower representations.

    • July 25, 2014

      Pedro Domingos

      Building very large commonsense knowledge bases and reasoning with them is a long-standing dream of AI. Today that knowledge is available in text; all we have to do is extract it. Text, however, is extremely messy, noisy, ambiguous, incomplete, and variable. A formal representation of it needs to be both probabilistic and relational, either of which leads to intractable inference and therefore poor scalability. In the first part of this talk I will describe tractable Markov logic, a language that is restricted enough to be tractable yet expressive enough to represent much of the commonsense knowledge contained in text. Even then, transforming text into a formal representation of its meaning remains a difficult problem. There is no agreement on what the representation primitives should be, and labeled data in the form of sentence-meaning pairs for training a semantic parser is very hard to come by. In the second part of the talk I will propose a solution to both these problems, based on concepts from symmetry group theory. A symmetry of a sentence is a syntactic transformation that does not change its meaning. Learning a semantic parser for a language is discovering its symmetry group, and the meaning of a sentence is its orbit under the group (i.e., the set of all sentences it can be mapped to by composing symmetries). Preliminary experiments indicate that tractable Markov logic and symmetry-based semantic parsing can be powerful tools for scalably extracting knowledge from text.
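
      The notion of an orbit used in this abstract can be sketched in a few lines. This is only a toy illustration, not the speaker's method: the symmetries here are hypothetical word swaps (each is its own inverse), and the helper names `swap` and `orbit` are invented. The orbit is computed as the set of sentences reachable by composing the given symmetries.

```python
from collections import deque


def swap(a, b):
    """Return a toy symmetry that exchanges the words a and b in a sentence."""
    def g(sentence):
        return " ".join(
            b if w == a else a if w == b else w for w in sentence.split()
        )
    return g


def orbit(sentence, symmetries):
    """Collect every sentence reachable from `sentence` by composing symmetries."""
    seen = {sentence}
    queue = deque([sentence])
    while queue:
        current = queue.popleft()
        for g in symmetries:
            image = g(current)
            if image not in seen:
                seen.add(image)
                queue.append(image)
    return seen


# Hypothetical meaning-preserving transformations, chosen only for illustration.
symmetries = [swap("bought", "purchased"), swap("big", "large")]
result = orbit("Alice bought a big car", symmetries)
```

      Under this toy view, two sentences have the same meaning exactly when they fall in the same orbit, so comparing orbits stands in for comparing meanings.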

    • June 4, 2014

      Paul Allen

      Paul Allen discusses his vision for the future of AI and AI2 in this fireside chat moderated by Gary Marcus of New York University at the 10th Anniversary Symposium - Allen Institute for Brain Science. AI2-related discussion begins at 17:30.

    • May 13, 2014

      Bart Selman

      In recent years, there has been tremendous progress in solving large-scale reasoning and optimization problems. Central to this progress has been the ability to automatically uncover hidden problem structure. Nevertheless, for the very hardest computational tasks, human ingenuity still appears indispensable. We show that automated reasoning strategies and human insights can effectively complement each other, leading to hybrid human-computer solution strategies that outperform other methods by orders of magnitude. We illustrate our approach with challenges in scientific discovery in the areas of finite mathematics and materials science.

    • March 31, 2014

      Dan Roth

      Machine Learning and Inference methods have become ubiquitous and have had a broad impact on a range of scientific advances and technologies and on our ability to make sense of large amounts of data. Research in Natural Language Processing has both benefited from and contributed to advancements in these methods and provides an excellent example of some of the challenges we face moving forward. I will describe some of our research in developing learning and inference methods in pursuit of natural language understanding. In particular, I will address what I view as some of the key challenges, including (i) learning models from natural interactions, without direct supervision, (ii) knowledge acquisition and the development of inference models capable of incorporating knowledge and reason, and (iii) scalability and adaptation: learning to accelerate inference during the lifetime of a learning system.

    • February 26, 2014

      Brendan O'Connor

      What can text analysis tell us about society? Corpora of news, books, and social media encode human beliefs and culture. But it is impossible for a researcher to read all of today's rapidly growing text archives. My research develops statistical text analysis methods that measure social phenomena from textual content, especially in news and social media data. For example: How do changes to public opinion appear in microblogs? What topics get censored in the Chinese Internet? What character archetypes recur in movie plots? How do geography and ethnicity affect the diffusion of new language?
