Viewing 32 videos from 2018 in Talks by Visiting Speakers. See AI2’s full collection of videos on our YouTube channel.
    • December 14, 2018

      Tal Linzen

      Recent technological advances have made it possible to train recurrent neural networks (RNNs) on a much larger scale than before. While these networks have proved effective in NLP applications, their limitations and the mechanisms by which they accomplish their goals are poorly understood. In this talk, I will show how methods from cognitive science can help elucidate and improve the syntactic representations employed by RNN language models. I will review evidence that RNN language models are able to process syntactic dependencies in typical sentences with considerable success across languages (Linzen et al. 2016, TACL; Gulordava et al. 2018, NAACL). However, when evaluated on experimentally controlled materials, their error rate increases sharply; explicit syntactic supervision mitigates the drop in performance (Marvin & Linzen 2018, EMNLP). Finally, I will discuss how language model adaptation can provide a tool for probing RNN syntactic representations, following the inspiration of the syntactic priming paradigm from psycholinguistics (van Schijndel & Linzen 2018, EMNLP).

    • December 12, 2018

      Panupong (Ice) Pasupat

      Natural language understanding models have achieved good enough performance for commercial products such as virtual assistants. However, their scope is mostly still limited to preselected domains or simpler sentences. I will present my line of work, which extends natural language understanding along two frontiers: handling open-domain environments such as the Web (breadth) and handling complex sentences (depth).

      The presentation will focus on the task of answering complex questions on semi-structured Web tables using question-answer pairs as supervision. Within the framework of semantic parsing, which is to learn to parse sentences into executable logical forms, I will explain our proposed methods to (1) flexibly handle lexical and syntactic mismatches between the questions and logical forms, (2) filter misleading logical forms that sometimes give correct answers, and (3) reuse parts of good logical forms to make training more efficient. I will also briefly mention how these ideas can be applied to several other natural language understanding tasks for Web interaction.

    • December 11, 2018

      Abhishek Das

      Building intelligent agents that possess the ability to perceive the rich visual environment around us, communicate this understanding in natural language to humans and other agents, and execute actions in a physical environment, is a long-term goal of Artificial Intelligence. In this talk, I will present some of my recent work at various points on this spectrum in connecting vision and language to actions; from Visual Dialog (CVPR17, ICCV17, HCOMP17) -- where we develop models capable of holding free-form visually-grounded natural language conversation towards a downstream goal and ways to evaluate them -- to Embodied Question Answering (CVPR18, CoRL18) -- where we augment these models to actively navigate in simulated environments and gather visual information necessary for answering questions.

    • November 16, 2018

      Shyam Upadhyay

      Lack of annotated data is a constant obstacle in developing machine learning models, especially for natural language processing (NLP) tasks. In this talk, I explore this problem in the realm of Multilingual NLP, where the challenges become more acute as most of the annotation efforts in the NLP community have been predominantly aimed at English.

      In particular, I will discuss two techniques for overcoming the lack of annotation in multilingual settings. I focus on two information extraction tasks --- cross-lingual entity linking and name transliteration to English --- for which traditional approaches rely on generous amounts of supervision in the language of interest. In the first part of the talk, I show how we can perform cross-lingual entity linking by sharing supervision across languages through a shared multilingual feature space. This approach enables us to complement the supervision in a low-resource language with supervision from a high resource language. In the second part, I show how we use freely available knowledge and unlabeled data to substitute for lack of supervision for the transliteration task. Key to the approach is a constrained bootstrapping algorithm that mines new example pairs for improving the transliteration model. Results on both tasks show the effectiveness of these approaches, and pave the way for future tasks involving the 3-way interaction of text, knowledge, and reasoning, in a multilingual setting.

    • November 12, 2018

      Kevin Jamieson

      In many science and industry applications, data-driven discovery is limited by the rate of data collection, such as the time it takes skilled labor to operate a pipette, the cost of expensive reagents, or the use of experimental apparatuses. When measurement budgets are necessarily small, adaptive data collection that uses previously collected data to inform future data collection in a closed loop can make the difference between inferring a phenomenon or not. While methods like multi-armed bandits have provided great insights into optimal means of collecting data in the last several years, these algorithms require a number of measurements that scales linearly with the total number of possible actions or measurements that can be made, even if discovering just one among possibly many true positives is desired. For example, if many of our 20,000 genes are critical for cell growth and a measurement corresponds to knocking out just one gene and measuring a noisy phenotype signal, one may expect that we can find a single influential gene with far fewer than 20,000 total measurements. In this talk I will ground this intuition in a theoretical framework and describe several applications where I have applied this perspective and new algorithms, including crowd-sourcing preferences, multiple testing with false discovery control, hyperparameter tuning, and crowdfunding.

    • October 26, 2018

      Sam Thomson

      Is there a class of models that perform competitively with LSTMs, yet are interpretable, parallelizable, data-efficient, and whose mathematical properties are already well-studied? I will present a recent line of work where we show that weighted finite-state automata (WFSAs) can be made unreasonably effective sequence encoders by letting their transition weights be calculated by neural nets.

      First, we introduce a specific architecture, Soft Patterns (SoPa), which generalizes convolutional neural networks (CNNs), capturing fixed-length but gappy patterns. We show that SoPa is competitive with LSTMs at text classification, and even outperforms LSTMs in small data regimes.

      Next, we explore the limits of this general approach. We show that several existing recurrent neural networks (RNNs) are in fact WFSAs in disguise, including quasi-recurrent neural networks, simple recurrent units, input switched affine networks, and more. These networks are already in popular use, showing strong performance on a variety of tasks. We formally define and characterize this class of RNNs, which includes CNNs but not arbitrary RNNs, dubbing them "rational recurrences."
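
      To make the idea of a WFSA as a sequence encoder concrete, here is a minimal sketch of how a weighted automaton scores a token sequence by propagating state weights through one transition matrix per token. The function `wfsa_score`, the three-state "not good" pattern, and the hand-written `toy_transition` table are hypothetical illustrations; in the approach described above the transition weights would be computed by a neural network rather than looked up.

```python
def wfsa_score(tokens, start, final, transition):
    """Score a sequence with a weighted finite-state automaton.

    transition(token) returns a matrix T where T[i][j] is the weight of
    moving from state i to state j while reading that token.
    """
    state = list(start)
    for tok in tokens:
        T = transition(tok)
        n = len(T[0])
        # One forward step: sum over all ways of reaching each state j.
        state = [sum(state[i] * T[i][j] for i in range(len(state)))
                 for j in range(n)]
    return sum(s * f for s, f in zip(state, final))


def toy_transition(token):
    """Hypothetical stand-in for a neural scorer: matches 'not good'."""
    T = [[0.0] * 3 for _ in range(3)]
    T[0][0] = 1.0          # wait in the start state on any token
    T[2][2] = 1.0          # stay in the accepting state on any token
    if token == "not":
        T[0][1] = 1.0      # begin the pattern
    if token == "good":
        T[1][2] = 1.0      # complete the pattern
    return T


START = [1.0, 0.0, 0.0]
FINAL = [0.0, 0.0, 1.0]

matched = wfsa_score("this is not good".split(), START, FINAL, toy_transition)
unmatched = wfsa_score("this is good".split(), START, FINAL, toy_transition)
print(matched, unmatched)
```

      With 0/1 weights the score simply counts the accepting paths; replacing the table with learned, real-valued weights is what turns the same recurrence into a trainable encoder.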

    • October 22, 2018

      Chelsea Finn

      Machine learning excels primarily in settings where an engineer can first reduce the problem to a particular function, and collect a substantial amount of labeled input-output pairs for that function. In drastic contrast, humans are capable of learning a range of versatile behaviors from streams of raw sensory data with minimal external instruction. How can we develop machines that learn more like the latter? In this talk, I will discuss recent work on learning versatile behaviors from raw sensory observations with minimal human supervision. In particular, I will show how we can use meta-learning to infer goals and intentions from humans with only a few positive examples, how robots can leverage large amounts of unlabeled experience to develop and plan with visual predictive models of the world, and how we can combine elements of meta-learning and unsupervised learning to develop agents that propose their own goals and learn to achieve them.

    • October 17, 2018

      Rishabh Iyer

      Visual data in the form of images and videos has been growing at an unprecedented rate in the last few years. While this massive data is a blessing to data science by helping improve predictive accuracy, it is also a curse since humans are unable to consume this large amount of data. Moreover, today, machine-generated videos (via drones, dash-cams, body-cams, security cameras, etc.) are being generated at a rate higher than what we as humans can process, and the majority of this data is plagued with redundancy. In this talk, I will present a unified framework for Submodular Optimization which provides an end-to-end solution to these problems. We first show that submodular functions naturally model notions of diversity, coverage, representation and information. Moreover, they also lend themselves to practical and provably near-optimal algorithms for optimization, thereby providing practical data summarization strategies. Along the way, we will highlight several implementation aspects of submodular optimization, including memoization tricks useful in building real-world summarization systems.

      We also show how we can efficiently learn submodular functions for different domains and tasks. We will demonstrate the utility of this in summarization tasks related to visual data: image collection summarization and domain-specific video summarization. What comprises a good visual summary depends on the domain at hand -- creating a video summary of a soccer game will involve very different modeling characteristics compared to a surveillance video. We take a principled approach towards domain-specific video summarization and argue that we can efficiently learn the right weights for the different model families. We shall point out several interesting observations and insights learnt from this characterization. Towards the end of this talk, we shall extend this work to training data subset selection, where we shall show how we can use our summarization framework for reducing training complexity, quick turn-around times for hyper-parameter tuning, and Diversified Active Learning.
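
      As a rough illustration of the kind of near-optimal algorithm mentioned above, the sketch below runs the standard greedy procedure on a simple coverage function, which is monotone submodular. The names `greedy_summary` and `frames`, and the choice of set coverage as the objective, are assumptions made for this example and not the learned functions from the talk. The running `covered` set plays the role of a memoized statistic, so each marginal gain is computed without re-evaluating the whole summary.

```python
def greedy_summary(item_concepts, budget):
    """Greedily pick up to `budget` items that maximize concept coverage.

    item_concepts maps an item id to the set of concepts it covers.
    """
    selected = []
    covered = set()              # memoized union of concepts covered so far
    remaining = set(item_concepts)
    for _ in range(budget):
        best, best_gain = None, 0
        for item in sorted(remaining):   # sorted for deterministic ties
            gain = len(item_concepts[item] - covered)  # marginal gain
            if gain > best_gain:
                best, best_gain = item, gain
        if best is None:         # nothing adds new coverage; stop early
            break
        selected.append(best)
        covered |= item_concepts[best]
        remaining.discard(best)
    return selected


# Hypothetical video frames tagged with the concepts they show.
frames = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {1, 2},
    "d": {5},
}
print(greedy_summary(frames, 2))
```

      For monotone submodular objectives under a cardinality budget, this greedy rule is known to achieve at least a (1 - 1/e) fraction of the optimal value, which is what makes it attractive for summarization. Note that the redundant frame "c" is never selected because it adds no new coverage.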

    • October 10, 2018

      Lucy Wang

      Human interpretability is essential in biomedicine, because information flow between computational platforms and human stakeholders is crucial to the proper management and care of disease. Biomedical data are abundant, but do not lend themselves to easy summary and interpretation. Luckily, there are many structured biomedical knowledge resources that can be used to assist in the analysis of all these data. How best to integrate ontological data with contemporary machine learning techniques is one of my main research interests, the other of which is to apply these integrated techniques to enhancing our understanding of specific human diseases.

      My research can be summarized into two themes: 1) the development of tools for modeling biomedical knowledge, and 2) the application of biomedical knowledge and natural language processing techniques to understanding biomedical and clinical texts. In this talk, I will describe a few of my projects and propose ways to extend some of these research ideas in the future.

    • October 1, 2018

      Ana Marasovic

      Abstract Anaphora Resolution (AAR) is a challenging task of finding a (typically) non-nominal antecedent of pronouns and noun phrases that refer to abstract objects like facts, events, actions or situations, in the (typically) preceding discourse. An example is given below.

      Our intuition is that we can learn what is the correct antecedent for a given abstract anaphor by learning attributes of the relation that holds between the sentence with the abstract anaphor and its antecedent. We propose a siamese-LSTM mention-ranking model to learn what characterizes mentioned relations [1].

      Although the current resources for AAR are really scarce, we can train our models on many instances of antecedent-anaphoric sentence pairs. Such pairs can be automatically extracted from parsed corpora by searching for constructions with embedded sentences, applying a simple transformation that replaces the embedded sentence with an abstract anaphor and using the cut-off embedded sentence as the antecedent [1].

      I will show results of the mention-ranking model trained for shell noun resolution [2] and results on an abstract anaphora subset of the ARRAU corpus [3]. Finally, I will discuss ideas on how the training data extraction method and the mention-ranking model could be further improved for the challenges ahead. In particular, I will talk about:

      (i) quality of harvested training data to answer whether nominal and pronominal anaphors can be learned independently, (ii) selecting antecedents from a wider preceding window, (iii) addressing differences between anaphora types with multi-task learning, (iv) addressing differences between harvested and natural data with adversarial training, (v) utilizing pretrained language models.

    • September 27, 2018

      Nicolas Fiorini

      PubMed is a free search engine for the biomedical literature accessed by millions of users from around the world each day. With the rapid growth of biomedical literature, finding and retrieving the most relevant papers for a given query is increasingly challenging. I will introduce Best Match, the new relevance search algorithm for PubMed that leverages click logs and learning-to-rank. The Best Match algorithm is trained with past user searches with dozens of relevance ranking signals (factors), the most important being the past usage of an article, publication date, BM25 score, and the type of article. This new algorithm demonstrated state-of-the-art retrieval performance in benchmarking experiments as well as an improved user experience in real-world testing.

    • August 29, 2018

      Robin Jia

      Reading comprehension systems that answer questions over a context passage can often achieve high test accuracy, but they are frustratingly brittle: they often rely heavily on superficial cues, and therefore struggle on out-of-domain inputs. In this talk, I will describe our work on understanding and challenging these systems. First, I will show how to craft adversarial reading comprehension examples by adding irrelevant distracting text to the context passage. Next, I will present the newest version of the SQuAD dataset, SQuAD 2.0, which tests whether models can distinguish answerable questions from similar but unanswerable ones. Finally, I will propose a new way of evaluating reading comprehension systems by measuring their zero-shot performance on other NLP tasks, such as relation extraction or semantic parsing, that have been converted to textual question answering problems.

    • August 24, 2018

      Sebastian Ruder

      Deep neural networks excel at learning from labeled data. In contrast, learning from unlabeled data, especially under domain shift, which is common in many real-world applications, remains a challenge. In this talk, I will touch on three aspects of learning under domain shift: First I will discuss an approach to select relevant data for domain adaptation in order to minimize negative transfer. Secondly, I will show how classic bootstrapping algorithms can be applied to neural networks and that they make for strong baselines in this challenging setting. Finally, I will describe new methods to use language models for semi-supervised learning.

    • August 21, 2018

      Chen Liang

      Learning to generate programs from natural language can support a wide range of applications including question answering, virtual assistants, AutoML, etc. It is natural to apply reinforcement learning to directly optimize the task reward, and generalization to new unseen inputs is crucial. However, three challenges need to be addressed: (1) how to model the structures in the programs; (2) how to efficiently learn from sparse rewards; (3) how to explore a large search space. In this talk, I will present (1) Neural Symbolic Machines (NSM), a hybrid framework that integrates a neural "programmer" with a symbolic "computer" to generate programs for multi-step reasoning; (2) Memory Augmented Policy Optimization (MAPO), a novel policy optimization formulation that incorporates a memory buffer of promising trajectories to reduce the variance of policy gradient estimates, especially given sparse rewards. NSM with MAPO is the first end-to-end model trained with RL that achieves new state-of-the-art on weakly supervised semantic parsing, evaluated on 3 well-established benchmarks: WebQuestionsSP, WikiTableQuestions, and WikiSQL.

    • August 6, 2018

      Pradeep Dasigi

      Natural Language Understanding systems typically involve encoding and reasoning components that are trained end-to-end to produce task-specific outputs given human utterances as inputs. I will talk about the role of external knowledge in making both these components better, and describe NLU systems that benefit from incorporating background and contextual knowledge. First, I will describe an approach for augmenting recurrent neural network models for encoding sentences, with background knowledge from knowledge bases like WordNet. I show that the resulting ontology-grounded context-sensitive representations of words lead to improvements in predicting prepositional phrase attachments and textual entailment.

      Second, I will focus on reasoning, and talk about complex question answering (QA) over structured contexts like tables and images. These QA tasks can be seen as semantic parsing problems, with supervision provided only in the form of answers, and not logical forms. I will discuss the challenges involved in the setup, and discuss three ways of exploiting contextual knowledge to deal with them: 1) use a grammar to constrain the output space of the decoder in a seq2seq model, 2) incorporate a minimal lexicon to bias the seq2seq model towards logical forms that are relevant to the utterances, and finally 3) exploit the compositionality of the logical form language to define a novel iterative training procedure for semantic parsers.

    • June 26, 2018

      Chaitanya Malaviya

      Morphological analysis involves predicting the syntactic traits of a word (e.g. {POS: Noun, Case: Acc, Gender: Fem}). Previous work in morphological tagging improves performance for low-resource languages (LRLs) through cross-lingual training with a high-resource language (HRL) from the same family, but is limited by the strict---often false---assumption that tag sets exactly overlap between the HRL and LRL. In this paper we propose a method for cross-lingual morphological tagging that aims to improve information sharing between languages by relaxing this assumption. The proposed model uses factorial conditional random fields with neural network potentials, making it possible to (1) utilize the expressive power of neural network representations to smooth over superficial differences in the surface forms, (2) model pairwise and transitive relationships between tags, and (3) accurately generate tag sets that are unseen or rare in the training data. Experiments on four languages from the Universal Dependencies Treebank demonstrate superior tagging accuracies over existing cross-lingual approaches.

    • June 13, 2018

      Hao Fang

      Engaging users in long, open-domain conversations with a chatbot remains a challenging research problem. Unlike task-oriented dialog systems which aim to accomplish small tasks quickly, users expect a broader variety of experiences from conversational chatbots (e.g., companionship, discussing recent news, or entertainment). The recent Alexa Prize has provided a new platform for researchers to build and test such open-domain dialog systems, i.e., socialbots, by allowing systems to interact with millions of real users through Alexa-enabled devices. The first part of this talk presents Sounding Board (winner of the 2017 Alexa Prize) and discusses how Sounding Board uses massive and dynamically changing online content to engage users in a coherent social conversation. While the Alexa platform provides an opportunity for getting real user feedback on a very large scale, some challenges remain. The second half of the talk focuses on addressing the challenge of scoring long socialbot conversations which cover several different topics. Using a large collection of Alexa Prize conversations, we study agent, content, and user factors that correlate with user ratings. We demonstrate approaches to estimate ratings at multiple levels of a long socialbot conversation.

    • June 7, 2018

      Vered Shwartz

      Recognizing lexical inferences is one of the building blocks of natural language understanding. Lexical inference corresponds to a semantic relation that holds between two lexical items (words and multi-word expressions), when the meaning of one can be inferred from the other. In reading comprehension, for example, answering the question "which phones have long-lasting batteries?" given the text "Galaxy has a long-lasting battery", requires knowing that Galaxy is a model of a phone. In text summarization, lexical inference can help identify redundancy, when two candidate sentences for the summary differ only in terms that hold a lexical inference relation (e.g. "the battery is long-lasting" and "the battery is enduring"). In this talk, I will present our work on automatic acquisition of lexical semantic relations from free text, focusing on two methods: the first is an integrated path-based and distributional method for recognizing lexical semantic relations (e.g. cat is a type of animal, tail is a part of cat). The second method focuses on the special case of interpreting the implicit semantic relation that holds between the constituent words of a noun compound (e.g. olive oil is made of olives, while baby oil is for babies).

    • May 18, 2018

      Hany Hassan

      Machine translation has made rapid advances in recent years. Millions of people are using it today in online translation systems and mobile applications in order to communicate across language barriers. The question naturally arises whether such systems can approach or achieve parity with human translations. In this talk, we first describe our recent advances in Neural Machine Translation (NMT) that led to state-of-the-art results on news translation. We then address the problem of how to define and accurately measure human parity in translation. We will discuss our system achieving human performance and discuss limitations as well as future directions of current NMT systems.

    • May 8, 2018

      Saining Xie

      With the support of big-data and big-compute, deep learning has reshaped the landscape of research and applications in artificial intelligence. Whilst traditional hand-guided feature engineering in many cases is simplified, the deep network architectures become increasingly more complex. A central question is if we can distill the minimal set of structural priors that can provide us the maximal flexibility and lead us to richer sets of structural primitives that potentially lay the foundations towards the ultimate goal of building general intelligent systems. In this talk I will introduce my Ph.D. work along the aforementioned direction. I will show how we can tackle different real world problems, with carefully designed architectures, guided by simple yet effective structural priors. In particular, I will focus on two structural priors that have proven to be useful in many different scenarios: the multi-scale prior and the sparse-connectivity prior. I will also show examples of learning structural priors from data, instead of hard-wiring them.

    • April 20, 2018

      Kyle Richardson

      In this talk, I will give an overview of research being done at the University of Stuttgart on semantic parser induction and natural language understanding. The main topic, semantic parser induction, relates to the problem of learning to map input text to full meaning representations from parallel datasets. Such resulting “semantic parsers” are often a core component in various downstream natural language understanding applications, including automated question-answering and generation systems. We look at learning within several novel domains and datasets being developed in Stuttgart (e.g., software documentation for text-to-code translation) and under various types of data supervision (e.g., learning from entailment, "polyglot" modeling, or learning from multiple datasets).

    • April 10, 2018

      Jesse Dodge

      Driven by the need for parallelizable hyperparameter optimization methods, we study open loop search methods: sequences that are predetermined and can be generated before a single configuration is evaluated. Examples include grid search, uniform random search, low discrepancy sequences, and other sampling distributions. In particular, we propose the use of k-determinantal point processes in hyperparameter optimization via random search. Compared to conventional uniform random search where hyperparameter settings are sampled independently, a k-DPP promotes diversity. We describe an approach that transforms hyperparameter search spaces for efficient use with a k-DPP. In addition, we introduce a novel Metropolis-Hastings algorithm which can sample from k-DPPs defined over any space from which uniform samples can be drawn, including spaces with a mixture of discrete and continuous dimensions or tree structure. Our experiments show significant benefits when tuning hyperparameters to neural models for text classification, with a limited budget for training supervised learners, whether in serial or parallel.
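
      The defining property of an open loop method is that the whole list of configurations exists before anything is trained. The sketch below shows this for two of the baselines named above, grid search and uniform random search. The function names and the example search space are hypothetical, and the sketch does not implement k-DPP sampling, which would replace the independent draws with a diversity-promoting joint sample.

```python
import itertools
import random


def grid_search(space):
    """Enumerate every combination of the listed values (open loop)."""
    names = sorted(space)
    return [dict(zip(names, combo))
            for combo in itertools.product(*(space[n] for n in names))]


def uniform_random_search(bounds, k, seed=0):
    """Draw k configurations independently and uniformly (open loop)."""
    rng = random.Random(seed)
    return [{name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
            for _ in range(k)]


# Hypothetical search space for a text classifier.
grid = grid_search({"learning_rate": [0.001, 0.01, 0.1], "dropout": [0.1, 0.5]})
samples = uniform_random_search({"learning_rate": (0.001, 0.1),
                                 "dropout": (0.0, 0.5)}, k=4)

# Both lists are fully determined before any model is evaluated, so the
# configurations can be dispatched to parallel workers in any order.
print(len(grid), len(samples))
```

      Because no configuration depends on the result of another, the evaluation budget can be spent serially or split across machines without changing which points are tried.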

    • April 2, 2018

      Rama Vedantam

      Understanding how to model vision and language jointly is a long-standing challenge in artificial intelligence. Vision is one of the primary sensors we use to perceive the world, while language is our data structure to represent and communicate knowledge. In this talk, we will take up three lines of attack to this problem: interpretation, grounding, and imagination. In interpretation, the goal will be to get machine learning models to understand an image and describe its contents using natural language in a contextually relevant manner. In grounding, we will connect natural language to referents in the physical world, and show how this can help learn common sense. Finally, we will study how to ‘imagine’ visual concepts completely and accurately across the full range and (potentially unseen) compositions of their visual attributes. We will study these problems from computational as well as algorithmic perspectives and suggest exciting directions for future work.

    • March 30, 2018

      Keisuke Sakaguchi

      Robustness has always been a desirable property for natural language processing. In many cases, NLP models (e.g., parsing) and downstream applications (e.g., MT) perform poorly when the input contains noise such as spelling errors, grammatical errors, and disfluency. In this talk, I will present three recent results on error correction models, at the character, word, and sentence level respectively. For the character level, I propose a semi-character recurrent neural network, which is motivated by a finding in psycholinguistics called the Cmabrigde Uinervtisy (Cambridge University) effect. For word-level robustness, I propose an error-repair dependency parsing algorithm for ungrammatical texts. The algorithm can parse sentences and correct grammatical errors simultaneously. Finally, I propose a neural encoder-decoder model with reinforcement learning for sentence-level error correction. To avoid exposure bias in standard encoder-decoders, the model directly optimizes towards a metric for grammatical error correction performance.
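
      The intuition behind the Cmabrigde Uinervtisy effect can be sketched in a few lines, assuming a word representation that keeps the first and last characters in place and treats the interior characters as an unordered bag. The helper `semi_character_key` is a hypothetical simplification for illustration; it produces a hashable key rather than the input vectors a recurrent network would consume.

```python
from collections import Counter


def semi_character_key(word):
    """Return (first char, bag of interior chars, last char) for a word."""
    if len(word) <= 2:
        # Too short to have interior characters to scramble.
        return (word[:1], (), word[1:])
    interior = Counter(word[1:-1])
    return (word[0], tuple(sorted(interior.items())), word[-1])


# Interior scrambling leaves the key unchanged.
same_1 = semi_character_key("Cmabrigde") == semi_character_key("Cambridge")
same_2 = semi_character_key("Uinervtisy") == semi_character_key("University")

# A real substitution changes the bag of interior characters.
different = semi_character_key("dog") != semi_character_key("dig")
print(same_1, same_2, different)
```

      One consequence of discarding interior order is that genuine anagram pairs such as "form" and "from" collide under this key, so a model built on this kind of representation has to rely on sentence context to tell them apart.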

    • March 28, 2018

      Arun Chaganty

      A significant challenge in developing systems for tasks such as knowledge base population, text summarization or question answering is simply evaluating their performance: existing fully-automatic evaluation techniques rely on an incomplete set of “gold” annotations that cannot adequately cover the range of possible outputs of such systems and lead to systematic biases against many genuinely useful system improvements. In this talk, I’ll present our work on how we can eliminate this bias by incorporating on-demand human feedback without incurring the full cost of human evaluation. Our key technical innovation is the design of good statistical estimators that are able to trade off cost for variance reduction. We hope that our work will enable the development of better NLP systems by making unbiased natural language evaluation practical and easy to use.

    • March 26, 2018

      Chenyan Xiong

      Search engines and other information systems have started to evolve from retrieving documents to providing more intelligent information access. However, the evolution is still in its infancy due to computers’ limited ability in representing and understanding human language. This talk will present my work addressing these challenges with knowledge graphs. The first part is about utilizing entities from knowledge graphs to improve search. I will discuss how we build better text representations with entities and how the entity-based text representations improve text retrieval. The second part is about better text understanding through modeling entity salience (importance), as well as how the improved text understanding helps search under both feature-based and neural ranking settings. This talk concludes with future directions towards the next generation of intelligent information systems.

    • March 7, 2018

      Yonatan Belinkov

      Language technology has become pervasive in everyday life, powering applications like Apple’s Siri or Google’s Assistant. Neural networks are a key component in these systems thanks to their ability to model large amounts of data. Contrary to traditional systems, models based on deep neural networks (a.k.a. deep learning) can be trained in an end-to-end fashion on input-output pairs, such as a sentence in one language and its translation in another language, or a speech utterance and its transcription. The end-to-end training paradigm simplifies the engineering process while giving the model flexibility to optimize for the desired task. This, however, often comes at the expense of model interpretability: understanding the role of different parts of the deep neural network is difficult, and such models are often perceived as “black-box”. In this work, I study deep learning models for two core language technology tasks: machine translation and speech recognition. I advocate an approach that attempts to decode the information encoded in such models while they are being trained. I perform a range of experiments comparing different modules, layers, and representations in the end-to-end models. The analyses illuminate the inner workings of end-to-end machine translation and speech recognition systems, explain how they capture different language properties, and suggest potential directions for improving them. The methodology is also applicable to other tasks in the language domain and beyond.

    • March 2, 2018

      Peter Jansen

      Modern question answering systems are able to provide answers to a set of common natural language questions, but their ability to answer complex questions, or to provide compelling explanations or justifications for why their answers are correct, is still quite limited. These limitations are major barriers in high-impact domains like science and medicine, where the cost of making errors is high and user trust is paramount. In this talk I'll discuss our recent work in developing systems that can build explanations to answer questions by aggregating information from multiple sources (sometimes called multi-hop inference). Aggregating information is challenging, particularly as the amount of information becomes large, because of "semantic drift": the tendency for inference algorithms to quickly move off-topic when assembling long chains of knowledge. Motivated by our earlier efforts in attempting to latently learn information aggregation for explanation generation (which is currently limited to short inference chains), I will discuss our current efforts to build a large corpus of detailed explanations expressed as lexically-connected explanation graphs to serve as training data for the multi-hop inference task. We will discuss characterizing what's in a science exam explanation, difficulties and methods for large-scale construction of detailed explanation graphs, and the possibility of automatically extracting common explanatory patterns from corpora such as this to support building large explanations (i.e. six or more aggregated facts) for unseen questions through merging, adapting, and adding to known explanatory patterns.

    • February 27, 2018

      Rob Speer and Catherine Havasi

      We are the developers of ConceptNet, a long-running knowledge representation project that originated from crowdsourcing. We demonstrate systems that we’ve made by adding the common knowledge in ConceptNet to current techniques in distributional semantics. This produces word embeddings that are state-of-the-art at semantic similarity in multiple languages, analogies that perform like a moderately-educated human on the SATs, the ability to find relevant distinctions between similar words, and the ability to propose new knowledge-graph edges and “sanity check” them against existing knowledge.

    • February 26, 2018

      Luheng He

      Semantic role labeling (SRL) systems aim to recover the predicate-argument structure of a sentence, to determine “who did what to whom”, “when”, and “where”. In this talk, I will describe my recent SRL work showing that relatively simple and general purpose neural architectures can lead to significant performance gains, including an error reduction of over 40% relative to long-standing pre-neural performance levels. These approaches are relatively simple because they process the text in an end-to-end manner, without relying on the typical NLP pipeline (e.g. POS-tagging or syntactic parsing). They are general purpose because, with only slight modifications, they can be used to learn state-of-the-art models for related semantics problems. The final architecture I will present, which we call Labeled Span Graph Networks (LSGNs), opens up exciting opportunities to build a single, unified model for end-to-end, document-level semantic analysis.

    • February 12, 2018

      Richard Zhang

      We explore the use of deep networks for image synthesis, both as a graphics goal and as an effective method for representation learning. We propose BicycleGAN, a general system for image-to-image translation problems, with the specific aim of capturing the multimodal nature of the output space. We study image colorization in greater detail and develop automatic and user-guided approaches. Moreover, colorization, as well as cross-channel prediction in general, is a simple but powerful pretext task for self-supervised feature learning. Not only does the network solve the direct graphics task, it also learns to capture patterns in the visual world, even without the benefit of human-curated labels. We demonstrate strong transfer to high-level semantic tasks, such as image classification, and to low-level human perceptual judgments. For the latter, we collect a large-scale dataset of human similarity judgments and find that our method outperforms traditional metrics such as PSNR and SSIM. We also discover that many unsupervised and self-supervised methods transfer strongly, performing comparably even to fully-supervised methods.

    • January 17, 2018

      Alexander Rush

      Early successes in deep generative models of images have demonstrated the potential of using latent representations to disentangle structural elements. These techniques have, so far, been less useful for learning representations of discrete objects such as sentences. In this talk I will discuss two works on learning different types of latent structure: Structured Attention Networks, a model for learning a soft-latent approximation of discrete structures such as segmentations, parse trees, and chained decisions; and Adversarially Regularized Autoencoders, a new GAN-based autoencoder for learning continuous representations of sentences with applications to textual style transfer. I will end by discussing an empirical analysis of some issues that make latent structure discovery in text difficult.
