Viewing 11-20 of 35 papers
  • Transfer Learning Between Related Tasks Using Expected Label Proportions

    Matan Ben Noach, Yoav GoldbergEMNLP2019Deep learning systems thrive on abundance of labeled training data but such data is not always available, calling for alternative methods of supervision. One such method is expectation regularization (XR) (Mann and McCallum, 2007), where models are trained based on expected label proportions. We… more
  • Question Answering is a Format; When is it Useful?

    Matt Gardner, Jonathan Berant, Hannaneh Hajishirzi, Alon Talmor, Sewon MinarXiv2019Recent years have seen a dramatic expansion of tasks and datasets posed as question answering, from reading comprehension, semantic role labeling, and even machine translation, to image and video understanding. With this expansion, there are many differing views on the utility and definition of… more
  • Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets

    Mor Geva, Yoav Goldberg, Jonathan BerantarXiv2019Crowdsourcing has been the prevalent paradigm for creating natural language understanding datasets in recent years. A common crowdsourcing practice is to recruit a small number of high-quality workers, and have them massively generate examples. Having only a few workers generate the majority of… more
  • MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension

    Alon Talmor, Jonathan BerantACL2019A large number of reading comprehension (RC) datasets has been created recently, but little analysis has been done on whether they generalize to one another, and the extent to which existing datasets can be leveraged for improving performance on new ones. In this paper, we conduct such an… more
  • Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing

    Ben Bogin, Jonathan Berant, Matt GardnerACL2019Research on parsing language to SQL has largely ignored the structure of the database (DB) schema, either because the DB was very simple, or because it was observed at both training and test time. In SPIDER, a recently-released text-to-SQL dataset, new and complex DBs are given at test time, and so… more
  • Aligning Vector-spaces with Noisy Supervised Lexicons

    Noa Yehezkel, Jacob Goldberger, Yoav GoldbergNAACL2019The problem of learning to translate between two vector spaces given a set of aligned points arises in several application areas of NLP. Current solutions assume that the lexicon which defines the alignment pairs is noise-free. We consider the case where the set of aligned points is allowed to… more
  • CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

    Alon Talmor, Jonathan Herzig, Nicholas Lourie, Jonathan BerantNAACL2019When answering a question, people often draw upon their rich world knowledge in addition to the particular context. Recent work has focused primarily on answering questions given some relevant document or context, and required very little general background. To investigate question answering with… more
  • DiscoFuse: A Large-Scale Dataset for Discourse-based Sentence Fusion

    Mor Geva, Eric Malmi, Idan Szpektor, Jonathan BerantNAACL2019Sentence fusion is the task of joining several independent sentences into a single coherent text. Current datasets for sentence fusion are small and insufficient for training modern neural models. In this paper, we propose a method for automatically-generating fusion examples from raw text and… more
  • Evaluating Text GANs as Language Models

    Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan BerantNAACL2019Generative Adversarial Networks (GANs) are a promising approach for text generation that, unlike traditional language models (LM), does not suffer from the problem of “exposure bias”. However, A major hurdle for understanding the potential of GANs for text generation is the lack of a clear… more
  • Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them

    Hila Gonen, Yoav GoldbergNAACL2019Word embeddings are widely used in NLP for a vast range of tasks. It was shown that word embeddings derived from text corpora reflect gender biases in society. This phenomenon is pervasive and consistent across different word embedding models, causing serious concern. Several recent works tackle… more