Research - Papers
Explore a selection of our published work on a variety of key research challenges in AI.
Semi-supervised sequence tagging with bidirectional language models
Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks. However, in most cases, the recurrent network that operates…
AI zooms in on highly influential citations
The number of times a paper is cited is a poor proxy for its impact (see P. Stephan et al. Nature 544, 411–412; 2017). I suggest relying instead on a new metric that uses artificial intelligence…
End-to-End Neural Ad-hoc Ranking with Kernel Pooling
This paper proposes K-NRM, a kernel-based neural model for document ranking. Given a query and a set of documents, K-NRM uses a translation matrix that models word-level similarities via word…
Explicit Semantic Ranking for Academic Search via Knowledge Graph Embedding
This paper introduces Explicit Semantic Ranking (ESR), a new ranking technique that leverages knowledge graph embedding. Analysis of the query log from our academic search engine,…
Learning to Predict Citation-Based Impact Measures
Citations implicitly encode a community's judgment of a paper's importance and thus provide a unique signal by which to study scientific impact. Efforts in understanding and refining this signal are…
Ontology Aware Token Embeddings for Prepositional Phrase Attachment
Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language. Instead, we embed…
The AI2 system at SemEval-2017 Task 10 (ScienceIE): semi-supervised end-to-end entity and relation extraction
This paper describes our submission for the ScienceIE shared task (SemEval-2017 Task 10) on entity and relation extraction from scientific papers. Our model is based on the end-to-end relation…
PDFFigures 2.0: Mining Figures from Research Papers
Figures and tables are key sources of information in many scholarly documents. However, current academic search engines do not make use of figures and tables when semantically parsing documents or…
Toward Automatic Bootstrapping of Online Communities Using Decision-theoretic Optimization
Successful online communities (e.g., Wikipedia, Yelp, and StackOverflow) can produce valuable content. However, many communities fail in their initial stages. Starting an online community is…
Identifying Meaningful Citations
We introduce the novel task of identifying important citations in scholarly literature, i.e., citations that indicate that the cited work is used or extended in the new effort. We believe this task…