See AI2’s full collection of videos on our YouTube channel.
    • April 8, 2019

      Swabha Swayamdipta

      As the availability of data for language learning grows, the role of linguistic structure is under scrutiny. At the same time, it is imperative to closely inspect patterns in data which might present loopholes for models to obtain high performance on benchmarks. In a two-part talk, I will address each of these challenges.

      First, I will introduce the paradigm of scaffolded learning. Scaffolds enable us to leverage inductive biases from one structural source for prediction of a different, but related structure, using only as much supervision as is necessary. We show that the resulting representations achieve improved performance across a range of tasks, indicating that linguistic structure remains beneficial even with powerful deep learning architectures.

      In the second part of the talk, I will showcase some of the properties exhibited by NLP models in large data regimes. Even as these models report excellent performance, sometimes claimed to beat humans, a closer look reveals that predictions are not a result of complex reasoning, and the task is not being completed in a generalizable way. Instead, this success can be largely attributed to exploitation of some artifacts of annotation in the datasets. I will discuss some questions our finding raises, as well as directions for future work.

    • April 3, 2019

      Arzoo Katiyar

      Extracting information from text entails deriving a structured, and typically domain-specific, representation of entities and relations from unstructured text. The information thus extracted can potentially facilitate applications such as question answering, information retrieval, conversational dialogue and opinion analysis. However, extracting information from text in a structured form is difficult: it requires understanding words and the relations that exist between them in the context of both the current sentence and the document as a whole.

      In this talk, I will present my research on neural models that learn structured output representations comprised of textual mentions of entities and relations within a sentence. In particular, I will propose the use of novel output representations that allow the neural models to learn better dependencies in the output structure and achieve state-of-the-art performance on both tasks as well as on nested variations. I will also describe our recent work on expanding the input context beyond sentences by incorporating coreference resolution to learn entity-level rather than mention-level representations and show that these representations can capture the information regarding the saliency of entities in the document.

    • March 29, 2019

      Daniel Khashabi

      Can we solve language understanding tasks without relying on task-specific annotated data? This could be important in scenarios where the inputs range across various domains and it is expensive to create annotated data.

      I discuss two different language understanding problems (Question Answering and Entity Typing) which have traditionally relied on direct supervision. For these problems, I present two recent works where exploiting properties of the underlying representations and indirect signals helps us move beyond traditional paradigms. As a result, we observe better generalization across domains.

    • March 11, 2019

      Rohit Girdhar

      Humans are arguably one of the most important entities that AI systems would need to understand to be useful and ubiquitous. From autonomous cars observing pedestrians, to assistive robots helping the elderly, a large part of this understanding is focused on recognizing human actions, and potentially, their intentions. Humans themselves are quite good at this task: we can look at a person and explain in great detail every action they are doing. Moreover, we can reason over those actions over time, and even predict what potential actions they may intend to do in the future. Computer vision algorithms, on the other hand, have lagged far behind on this task. In my research, I’ve explored techniques to improve human action understanding from a visual input, with the key insight being that human actions are dependent on the state of their environment (parameterized by the scene and the objects in it) apart from their own state (parameterized by their pose). In this talk, I will talk about three key ways I exploit this dependence: (1) Learning to aggregate this contextual information to recognize human actions; (2) Predicting a prior on human actions by learning about the affordances of the scenes and objects they interact with; and finally, (3) Moving towards longer term temporal reasoning through a new dataset and benchmark tasks.

    • March 7, 2019

      "An Ethical Crisis in Computing?" Moshe Vardi | Karen Ostrum George Distinguished Professor, Computational Engineering, Rice University

      "Algorithmic Accountability: Designing for Safety" Ben Shneiderman | Distinguished Professor, Department of Computer Science, University of Maryland, College Park

      "AI Policy: What to Do Now, Soon, and One Day" Ryan Calo | Lane Powell & D. Wayne Gittinger Associate Professor of Law, University of Washington

      "Less Talk, More Do: Applied Ethics in AI" Tracy Kosa | Adjunct Professor, Faculty of Law and Albers School of Business, Seattle University

      Panel Q&A Oren Etzioni and speakers

    • March 1, 2019

      Reut Tsarfaty

      Can we program computers in our native tongue? This idea, termed natural language programming (NLPRO), has attracted attention almost since the inception of computers themselves.

      From the point of view of software engineering (SE), efforts to program in natural language (NL) have relied thus far on controlled natural languages (CNL) -- small unambiguous fragments of English with restricted grammars and limited expressivity. Is it possible to replace these CNLs with truly natural, human language? From the point of view of natural language processing (NLP), current technology successfully extracts information from NL texts. However, the level of NL understanding required for programming in NL goes far beyond such information extraction. Is it possible to endow computers with a dynamic kind of NL understanding? In this talk I argue that the solutions to these seemingly separate challenges are actually closely intertwined, and that one community's challenge is the other community's stepping stone for a huge leap and vice versa.

      Specifically, in this talk I propose to view executable programs in SE as semantic structures in NLP, as the basis for broad-coverage semantic parsing. I present a feasibility study on the semantic parsing of requirements documents into executable scenarios, where the requirements are written in a restricted yet highly ambiguous fragment of English, and the target representation employs live sequence charts (LSC), a multi-modal executable programming language. The parsing architecture I propose jointly models sentence-level and discourse-level processing in a generative probabilistic framework. I empirically show that the discourse-based model consistently outperforms the sentence-based model, constructing a system that reflects both the static (entities, properties) and dynamic (behavioral scenarios) requirements in the input document.

    • February 5, 2019

      Julia Lane

      The social sciences are at a crossroads. The great challenges of our time are human in nature - terrorism, climate change, the use of natural resources, and the nature of work - and require robust social science to understand the sources and consequences. Yet the lack of reproducibility and replicability evident in many fields is even more acute in the study of human behavior, both because of the difficulty of sharing confidential data and because of the lack of scientific infrastructure. Much of the core infrastructure is manual and ad-hoc in nature, threatening the legitimacy and utility of social science research.

      A major challenge is search and discovery. The vast majority of social science data and outputs cannot be easily discovered by other researchers even when nominally deposited in the public domain. A new generation of automated search tools could help researchers discover how data are being used, in what research fields, with what methods, with what code and with what findings. And automation can be used to reward researchers who validate the results and contribute additional information about use, fields, methods, code, and findings. In sum, the use of data depends critically on knowing how they have been produced and used before. The required elements are what the data measure, what research has been done by which researchers, with what code, and with what results.

      In this presentation I describe the work that we are doing to build and develop automated tools to create the equivalent of an or TripAdvisor for the access and use of confidential microdata.

    • January 25, 2019

      Qiang Ning

      Time is an important dimension when we describe the world because the world is evolving over time and many facts are time-sensitive. Understanding time is thus an important aspect of natural language understanding and many applications may rely on it, e.g., information retrieval, summarization, causality, and question answering.

      In this talk, I will mainly focus on a key component of it, temporal relation extraction. The task has long been challenging because the actual timestamps of events are rarely expressed explicitly, so their temporal order has to be inferred from lexical cues, from reading between the lines, and often from strong background knowledge. Additionally, collecting enough high-quality annotations to support machine learning algorithms for this task is difficult, which makes the task even more challenging to investigate. I tackled this task from three perspectives (structured learning, common sense, and data collection) and have improved the state of the art by approximately 20% in absolute F1. My current system, CogCompTime, is available as an online demo. In the future, I expect to expand my research in these directions to other core problems in AI such as incidental supervision, semantic parsing, and knowledge representation.

    • January 11, 2019

      Rik Koncel-Kedziorski

      In this talk I will introduce a new model for encoding knowledge graphs and generating texts from them. Graphical knowledge representations are ubiquitous in computing, but pose a challenge for text generation techniques due to their non-hierarchical structure and collapsing of long-distance dependencies. Moreover, automatically extracted knowledge is noisy, and so requires a text generation model be robust. To address these issues, I introduce a novel attention-based encoder-decoder model for knowledge-graph-to-text generation. This model extends the popular Transformer for text encoding to function over graph-structured inputs. The result is a powerful, general model for graph encoding which can incorporate global structural information when contextualizing vertices in their local neighborhoods. Through detailed automatic and human evaluations I demonstrate the value of conditioning text generation on graph-structured knowledge, as well as the superior performance of the proposed model compared to recent work.

    • December 14, 2018

      Tal Linzen

      Recent technological advances have made it possible to train recurrent neural networks (RNNs) on a much larger scale than before. While these networks have proved effective in NLP applications, their limitations and the mechanisms by which they accomplish their goals are poorly understood. In this talk, I will show how methods from cognitive science can help elucidate and improve the syntactic representations employed by RNN language models. I will review evidence that RNN language models are able to process syntactic dependencies in typical sentences with considerable success across languages (Linzen et al. 2016, TACL; Gulordava et al. 2018, NAACL). However, when evaluated on experimentally controlled materials, their error rate increases sharply; explicit syntactic supervision mitigates the drop in performance (Marvin & Linzen 2018, EMNLP). Finally, I will discuss how language model adaptation can provide a tool for probing RNN syntactic representations, following the inspiration of the syntactic priming paradigm from psycholinguistics (van Schijndel & Linzen 2018, EMNLP).

    • December 12, 2018

      Panupong (Ice) Pasupat

      Natural language understanding models have achieved good enough performance for commercial products such as virtual assistants. However, their scopes are mostly still limited to preselected domains or simpler sentences. I will present my line of work which extends natural language understanding in two frontiers: handling open-domain environments such as the Web (breadth) and handling complex sentences (depth).

      The presentation will focus on the task of answering complex questions on semi-structured Web tables using question-answer pairs as supervision. Within the framework of semantic parsing, which is to learn to parse sentences into executable logical forms, I will explain our proposed methods to (1) flexibly handle lexical and syntactic mismatches between the questions and logical forms, (2) filter misleading logical forms that sometimes give correct answers, and (3) reuse parts of good logical forms to make training more efficient. I will also briefly mention how these ideas can be applied to several other natural language understanding tasks for Web interaction.

    • December 11, 2018

      Abhishek Das

      Building intelligent agents that possess the ability to perceive the rich visual environment around us, communicate this understanding in natural language to humans and other agents, and execute actions in a physical environment, is a long-term goal of Artificial Intelligence. In this talk, I will present some of my recent work at various points on this spectrum in connecting vision and language to actions; from Visual Dialog (CVPR17, ICCV17, HCOMP17) -- where we develop models capable of holding free-form visually-grounded natural language conversation towards a downstream goal and ways to evaluate them -- to Embodied Question Answering (CVPR18, CoRL18) -- where we augment these models to actively navigate in simulated environments and gather visual information necessary for answering questions.

    • December 6, 2018

      Oren Etzioni

      Dr. Oren Etzioni, Chief Executive Officer of the Allen Institute for Artificial Intelligence and professor of computer science at the University of Washington, addresses one of the Holy Grails of AI: acquiring, representing and utilizing common-sense knowledge, during a distinguished lecture series held at the Office of Naval Research.

    • November 16, 2018

      Shyam Upadhyay

      Lack of annotated data is a constant obstacle in developing machine learning models, especially for natural language processing (NLP) tasks. In this talk, I explore this problem in the realm of Multilingual NLP, where the challenges become more acute as most of the annotation efforts in the NLP community have been predominantly aimed at English.

      In particular, I will discuss two techniques for overcoming the lack of annotation in multilingual settings. I focus on two information extraction tasks --- cross-lingual entity linking and name transliteration to English --- for which traditional approaches rely on generous amounts of supervision in the language of interest. In the first part of the talk, I show how we can perform cross-lingual entity linking by sharing supervision across languages through a shared multilingual feature space. This approach enables us to complement the supervision in a low-resource language with supervision from a high resource language. In the second part, I show how we use freely available knowledge and unlabeled data to substitute for lack of supervision for the transliteration task. Key to the approach is a constrained bootstrapping algorithm that mines new example pairs for improving the transliteration model. Results on both tasks show the effectiveness of these approaches, and pave the way for future tasks involving the 3-way interaction of text, knowledge, and reasoning, in a multilingual setting.

    • November 12, 2018

      Kevin Jamieson

      In many science and industry applications, data-driven discovery is limited by the rate of data collection, for example by the time it takes skilled labor to operate a pipette, the cost of expensive reagents, or the use of experimental apparatuses. When measurement budgets are necessarily small, adaptive data collection that uses previously collected data to inform future data collection in a closed loop can make the difference between inferring a phenomenon or not. While methods like multi-armed bandits have provided great insights into optimal means of collecting data in the last several years, these algorithms require a number of measurements that scales linearly with the total number of possible actions or measurements that can be made, even if discovering just one among possibly many true positives is desired. For example, if many of our 20,000 genes are critical for cell growth and a measurement corresponds to knocking out just one gene and measuring a noisy phenotype signal, one may expect that we can find a single influential gene with far fewer than 20,000 total measurements. In this talk I will ground this intuition in a theoretical framework and describe several applications where I have applied this perspective and new algorithms, including crowd-sourcing preferences, multiple testing with false discovery control, hyperparameter tuning, and crowdfunding.

    • October 26, 2018

      Sam Thomson

      Is there a class of models that perform competitively with LSTMs, yet are interpretable, parallelizable, data-efficient, and whose mathematical properties are already well-studied? I will present a recent line of work where we show that weighted finite-state automata (WFSAs) can be made unreasonably effective sequence encoders by letting their transition weights be calculated by neural nets.

      First, we introduce a specific architecture, Soft Patterns (SoPa), which generalizes convolutional neural networks (CNNs), capturing fixed-length but gappy patterns. We show that SoPa is competitive with LSTMs at text classification, and even outperforms LSTMs in small data regimes.

      Next, we explore the limits of this general approach. We show that several existing recurrent neural networks (RNNs) are in fact WFSAs in disguise, including quasi-recurrent neural networks, simple recurrent units, input switched affine networks, and more. These networks are already in popular use, showing strong performance on a variety of tasks. We formally define and characterize this class of RNNs, which includes CNNs but not arbitrary RNNs, dubbing them "rational recurrences."

    • October 22, 2018

      Chelsea Finn

      Machine learning excels primarily in settings where an engineer can first reduce the problem to a particular function, and collect a substantial amount of labeled input-output pairs for that function. In drastic contrast, humans are capable of learning a range of versatile behaviors from streams of raw sensory data with minimal external instruction. How can we develop machines that learn more like the latter? In this talk, I will discuss recent work on learning versatile behaviors from raw sensory observations with minimal human supervision. In particular, I will show how we can use meta-learning to infer goals and intentions from humans with only a few positive examples, how robots can leverage large amounts of unlabeled experience to develop and plan with visual predictive models of the world, and how we can combine elements of meta-learning and unsupervised learning to develop agents that propose their own goals and learn to achieve them.

    • October 17, 2018

      Rishabh Iyer

      Visual data in the form of images and videos has been growing at an unprecedented rate in the last few years. While this massive data is a blessing to data science by helping improve predictive accuracy, it is also a curse since humans are unable to consume this large amount of data. Moreover, machine-generated videos (via drones, dash-cams, body-cams, security cameras, etc.) are now being produced at a rate higher than what we as humans can process, and the majority of this data is plagued with redundancy. In this talk, I will present a unified framework for Submodular Optimization which provides an end-to-end solution to these problems. We first show that submodular functions naturally model notions of diversity, coverage, representation and information. Moreover, they also lend themselves to practical and provably near-optimal algorithms for optimization, thereby providing practical data summarization strategies. Along the way, we will highlight several implementational aspects of submodular optimization, including memoization tricks useful in building real-world summarization systems.

      We also show how we can efficiently learn submodular functions for different domains and tasks. We will demonstrate the utility of this in summarization tasks related to visual data: image collection summarization and domain-specific video summarization. What comprises a good visual summary depends on the domain at hand -- creating a video summary of a soccer game will involve very different modeling characteristics compared to a surveillance video. We take a principled approach towards domain-specific video summarization and show how we can efficiently learn the right weights for the different model families. We shall point out several interesting observations and insights learnt from this characterization. Towards the end of this talk, we shall extend this work to training data subset selection, where we shall show how we can use our summarization framework for reducing training complexity, achieving quick turn-around times for hyper-parameter tuning, and Diversified Active Learning.

    • October 10, 2018

      Lucy Wang

      Human interpretability is essential in biomedicine, because information flow between computational platforms and human stakeholders is crucial to the proper management and care of disease. Biomedical data are abundant, but do not lend themselves to easy summary and interpretation. Luckily, there are many structured biomedical knowledge resources that can be used to assist in the analysis of all these data. How best to integrate ontological data with contemporary machine learning techniques is one of my main research interests; the other is applying these integrated techniques to enhance our understanding of specific human diseases.

      My research can be summarized into two themes: 1) the development of tools for modeling biomedical knowledge, and 2) the application of biomedical knowledge and natural language processing techniques to understanding biomedical and clinical texts. In this talk, I will describe a few of my projects and propose ways to extend some of these research ideas in the future.

    • October 1, 2018

      Ana Marasovic

      Abstract Anaphora Resolution (AAR) is a challenging task of finding a (typically) non-nominal antecedent of pronouns and noun phrases that refer to abstract objects like facts, events, actions or situations, in the (typically) preceding discourse. An example is given below.

      Our intuition is that we can learn what is the correct antecedent for a given abstract anaphor by learning attributes of the relation that holds between the sentence with the abstract anaphor and its antecedent. We propose a siamese-LSTM mention-ranking model to learn what characterizes mentioned relations [1].

      Although the current resources for AAR are really scarce, we can train our models on many instances of antecedent-anaphoric sentence pairs. Such pairs can be automatically extracted from parsed corpora by searching for constructions with embedded sentences, applying a simple transformation that replaces the embedded sentence with an abstract anaphor and using the cut-off embedded sentence as the antecedent [1].

      I will show results of the mention-ranking model trained for shell noun resolution [2] and results on an abstract anaphora subset of the ARRAU corpus [3]. Finally, I will discuss ideas on how the training data extraction method and the mention-ranking model could be further improved for the challenges ahead. In particular, I will talk about:

      (i) quality of harvested training data to answer whether nominal and pronominal anaphors can be learned independently, (ii) selecting antecedents from a wider preceding window, (iii) addressing differences between anaphora types with multi-task learning, (iv) addressing differences between harvested and natural data with adversarial training, (v) utilizing pretrained language models.
