Knowledge Bases of Science with Representation and Reasoning through Universal Schema

Data Science Seminar Series - Andrew McCallum - U of Mass Amherst - October 25th - 2pm - Room C2010

October 25, 2017, 2:00 PM to 3:00 PM
NSF Room C2010

Abstract:
We want to build a large-scale knowledge base of science containing entities and relations in fields such as biomedicine, materials science, computer science, and STEM career paths. Work in knowledge representation and knowledge bases has long struggled to design schemas of entity- and relation-types that capture the desired balance of specificity and generality while also supporting reasoning and information integration from various sources of input evidence. In this talk I will describe our work in "universal schema," a deep learning approach to knowledge representation in which we operate on the union of all input schemas (from structured KBs to natural language textual patterns) while also supporting integration and generalization by learning vector embeddings whose neighborhoods capture semantic implicature. I will also discuss our work in (a) large-scale, non-greedy clustering for entity resolution, (b) question answering with chains of reasoning, using reinforcement learning to guide the efficient search for meaningful chains, (c) embedded vector representations of common sense, and (d) applications to materials science (in collaboration with Elsa Olivetti, MIT) and biomedicine. I also hope to describe our ongoing efforts to revolutionize scientific peer review by creating systems supporting a variety of reviewing workflows, including "open peer review" and improved expertise modeling.
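To make the "union of all input schemas" idea concrete, here is a minimal sketch, not the speaker's implementation. It builds one matrix whose rows are entity pairs and whose columns mix structured KB relations with textual patterns, then compares columns by cosine similarity over raw co-occurrence. This is only a stand-in for the learned vector embeddings the abstract describes; the entity names, the `kb:` and `text:` column labels, and the helper functions are all hypothetical.

```python
from math import sqrt

# Hypothetical toy observations: entity pair -> columns observed for that pair.
# Columns come from two input schemas: "kb:" relations and "text:" patterns.
observations = {
    ("alice", "univ_a"): {"kb:employed_by", "text:X is a professor at Y"},
    ("bob", "univ_b"): {"kb:employed_by", "text:X is a professor at Y"},
    ("carol", "univ_c"): {"text:X is a professor at Y"},
    ("dave", "univ_a"): {"kb:alma_mater", "text:X obtained a PhD from Y"},
}

# The combined schema is simply the union of every column seen in any source.
columns = sorted(set().union(*observations.values()))
rows = sorted(observations)


def column_vector(col):
    """Binary vector over entity pairs: 1.0 where the column was observed."""
    return [1.0 if col in observations[row] else 0.0 for row in rows]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_columns(col):
    """Other columns ranked by co-occurrence similarity to `col`, best first."""
    target = column_vector(col)
    scored = [(cosine(target, column_vector(other)), other)
              for other in columns if other != col]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, col in nearest_columns("text:X is a professor at Y"):
        print(f"{score:.3f}  {col}")
```

In this toy data the textual pattern "X is a professor at Y" lands closest to the KB relation `kb:employed_by`, which hints at how a neighborhood could suggest the missing KB fact for the ("carol", "univ_c") row. A learned embedding model would replace the raw co-occurrence vectors with trained low-dimensional ones.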
 
Bio:
Andrew McCallum is a Professor and Director of the Information Extraction and Synthesis Laboratory, as well as Director of the Center for Data Science in the College of Information and Computer Sciences at the University of Massachusetts Amherst. He has published over 250 papers in many areas of AI, including natural language processing, machine learning, and reinforcement learning; his work has received over 50,000 citations. He obtained his PhD from the University of Rochester in 1995 with Dana Ballard and completed a postdoctoral fellowship at CMU with Tom Mitchell and Sebastian Thrun. In the early 2000s he was Vice President of Research and Development at WhizBang Labs, a 170-person start-up company that used machine learning for information extraction from the Web. He is a AAAI Fellow, the recipient of the UMass Chancellor's Award for Research and Creative Activity, the UMass NSM Distinguished Research Award, the UMass Lilly Teaching Fellowship, and research awards from Google, IBM, Microsoft, Oracle, and Yahoo. He was the General Chair for the International Conference on Machine Learning (ICML) 2012, and is now serving as Past-President of the International Machine Learning Society, as well as a member of the editorial board of the Journal of Machine Learning Research. For the past twenty years, McCallum has been active in research on statistical machine learning applied to text, especially information extraction, entity resolution, social network analysis, structured prediction, semi-supervised learning, and deep neural networks for knowledge representation. His work on open peer review can be found at http://openreview.net. McCallum's web page is http://www.cs.umass.edu/~mccallum

To join the webinar, please register at:  http://www.tvworldwide.com/events/nsf/171025/

This event is part of Webinars/Webcasts.

Meeting Type
Webcast

Contacts
Vandana Janeja, email: vjaneja@nsf.gov

NSF Related Organizations
Directorate for Computer and Information Science and Engineering