In recent years, we have witnessed the rise of a variety of knowledge graphs (KGs) that contain entities, concepts, and the semantic relationships among them. The ever-increasing demand for the semantic understanding of data has driven continuous research progress on KGs, which have become one of the most prominent representations of knowledge in the big data era. KGs provide rich background knowledge for semantic understanding and have consequently become a key enabling technique for next-generation intelligent information processing and machine intelligence. They have been successfully used in many real applications such as smart search and text understanding. Recently, with the enhanced ability of machines to learn from and manage big data, many new opportunities for using KGs are emerging. Meanwhile, KG research also faces new challenges, such as the web scale, complex structure, and heterogeneous sources of KGs.
In this workshop, we have invited more than ten leading researchers from around the world to give talks. The workshop aims to present the current progress of knowledge graphs in both academia and industry, discuss the major research challenges, and map out a plan for future knowledge graph research. We welcome all researchers and students to attend.
The deep learning tsunami continues to take over NLP. In many cases, by replacing hand-crafted and heavily engineered traditional NLP methods, it reduces development cost and achieves performance gains. In this talk, I will discuss a deep learning approach to parsing and describe its potential for handling many other text processing tasks.
Haixun Wang is a research scientist / engineering manager at Facebook. Before Facebook, he was with Google Research, working on natural language processing. He led research in semantic search, graph data processing systems, and distributed query processing at Microsoft Research Asia. He was a research staff member at IBM T. J. Watson Research Center from 2000 to 2009, serving as Technical Assistant to Stuart Feldman (Vice President of Computer Science of IBM Research) from 2006 to 2007 and to Mark Wegman (Head of Computer Science of IBM Research) from 2007 to 2009. He received his Ph.D. in computer science from the University of California, Los Angeles in 2000. He has published more than 150 research papers in refereed international journals and conference proceedings. He has served as PC chair of conferences such as CIKM’12 and is on the editorial boards of IEEE Transactions on Knowledge and Data Engineering (TKDE) and the Journal of Computer Science and Technology (JCST). He won the best paper award at ICDE 2015, the 10-year best paper award at ICDM 2013, and the best paper award at ER 2009.
Knowledge graphs have lately been considered a machine intelligence platform for Web search and software in general. In this talk, I will present my recent research on automatically harvesting, maintaining, and integrating (spatial) data intelligence from the Web.
Seung-won Hwang is a Professor of Computer Science at Yonsei University. Prior to joining Yonsei, she was an Associate Professor at POSTECH for 10 years, after receiving her Ph.D. in Computer Science from UIUC. Her recent research interest is data(-driven) intelligence, which has led to 100+ publications at top-tier database, data mining, AI, and NLP venues, including ACL, AAAI, SIGMOD, VLDB, ICDE, and WSDM (best paper).
Many large-scale machine-readable entity knowledge bases have emerged in recent years and have proven very useful for building semantic search and deep Q/A systems. As an important tool for enriching knowledge bases, entity linking maps entity mentions appearing in Web text to their corresponding entities in a knowledge base, and it has many applications in content analysis and business intelligence beyond knowledge base population. However, the task is challenging due to name variations and entity ambiguity. In this talk, we will introduce some of our efforts on entity linking for heterogeneous data, including entity linking for unstructured Web free text, for structured Web lists and tables, and for tweets, and discuss their various applications.
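The core idea of entity linking, candidate generation followed by context-based disambiguation, can be illustrated with a minimal sketch. The toy knowledge base, entity names, and context words below are all invented for illustration; they do not reflect the speakers' actual system or features.

```python
# Toy knowledge base: entity -> context words typically seen near it.
TOY_KB = {
    "Michael Jordan (basketball)": {"nba", "bulls", "basketball"},
    "Michael Jordan (scientist)": {"machine", "learning", "berkeley"},
}

def link(mention, context):
    """Resolve a mention: generate candidates by name match,
    then pick the one whose context words overlap most."""
    candidates = [e for e in TOY_KB if mention.lower() in e.lower()]
    # Context overlap stands in for the richer features a real linker uses.
    return max(candidates, key=lambda e: len(TOY_KB[e] & context))

print(link("Michael Jordan", {"berkeley", "machine", "learning"}))
# -> Michael Jordan (scientist)
```

Real systems replace the overlap score with learned similarity features, but the two-stage structure is the same.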
Jianyong Wang is currently a professor in the Department of Computer Science and Technology, Tsinghua University, Beijing, China. He received his Ph.D. in computer science in 1999 from the Institute of Computing Technology, Chinese Academy of Sciences. He was previously an assistant professor at Peking University, and visited Simon Fraser University, the University of Illinois at Urbana-Champaign, and the University of Minnesota at Twin Cities before joining Tsinghua University. His research interests mainly include data mining and Web information management. He has co-authored over 70 papers in leading international conferences and journals. He has served as PC co-chair of WISE’15, BioMedCom'14, WAIM'13, ADMA'11, and NDBC'10, and is an associate editor of IEEE TKDE.
Verbs play an important role in the understanding of natural language text. This paper studies the problem of abstracting the subject and object arguments of a verb into a set of noun concepts, known as “argument concepts”. This set of concepts, whose size is parameterized, represents the fine-grained semantics of a verb. For example, the object of “enjoy” can be abstracted into time, hobby, event, etc. We present a novel framework that automatically infers human-readable and machine-computable action concepts with high accuracy.
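One simple way to think about picking a size-k set of argument concepts is greedy coverage: choose concepts that together explain as many observed arguments as possible. The sketch below is only an illustration of that idea; the verb's observed objects, the concept inventory, and the greedy scoring are all invented and are not the framework described in the talk.

```python
# Hypothetical observed objects of the verb "enjoy".
OBJECTS_OF_ENJOY = {"movie", "weekend", "swimming", "holiday", "reading"}

# Hypothetical isA inventory: concept -> objects it covers.
CONCEPT_MEMBERS = {
    "event": {"movie"},
    "time":  {"weekend", "holiday"},
    "hobby": {"swimming", "reading"},
}

def argument_concepts(objects, inventory, k):
    """Greedily pick k concepts that together cover the most objects."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max(inventory, key=lambda c: len(inventory[c] - covered))
        chosen.append(best)
        covered |= inventory[best]
    return chosen

print(argument_concepts(OBJECTS_OF_ENJOY, CONCEPT_MEMBERS, 2))
```

With k = 2 the sketch picks "time" and "hobby", which cover four of the five objects; the parameter k trades off coverage against the readability of the concept set.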
Kenny Qili Zhu is a Distinguished Research Professor (Ph.D. advisor) in the Department of Computer Science and Engineering at Shanghai Jiao Tong University. He graduated with a B.Eng. (Hons) in Electrical Engineering in 1999 and a Ph.D. in Computer Science in 2005 from the National University of Singapore. He was a postdoctoral researcher and lecturer at Princeton University from 2007 to 2009. Prior to that, he was a software design engineer at Microsoft, Redmond, WA. From February 2010 to August 2010, he was a visiting professor at Microsoft Research Asia in Beijing. Kenny's main research interests are data and knowledge engineering and programming languages. He has published extensively on databases, AI, and programming languages at top venues. He has served on the PCs of WWW, CIKM, ECML, COLING, SAC, WAIM, APLAS, and NDBC, among others. His research has been supported by NSF China, MOE China, Microsoft, Google, Oracle, Morgan Stanley, and AstraZeneca. Kenny is the winner of a 2013 Google Faculty Research Award and the 2014 DASFAA Best Paper Award.
We propose techniques for processing SPARQL queries over a large RDF graph in a distributed environment, adopting a “partial evaluation and assembly” framework. Answering a SPARQL query Q is equivalent to finding subgraph matches of the query graph Q over the RDF graph G. Based on the properties of subgraph matching over a distributed graph, we introduce local partial matches as partial answers in each fragment of G. For assembly, we propose two methods: centralized and distributed assembly. We analyze our algorithms both theoretically and experimentally. Extensive experiments over both real and benchmark RDF repositories of billions of triples on clusters of 10-50 machines confirm that our method is superior to the state-of-the-art methods in both performance and scalability.
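The equivalence between SPARQL answering and subgraph matching can be seen on a tiny single-machine example. The graph, query, and naive backtracking matcher below are invented for illustration; the talk's system distributes this computation across fragments and assembles local partial matches, which this sketch does not attempt.

```python
# A toy RDF graph as (subject, predicate, object) triples.
G = [("alice", "knows", "bob"),
     ("bob", "livesIn", "paris"),
     ("alice", "livesIn", "paris")]

# SPARQL-style query graph: ?x knows ?y . ?y livesIn paris
Q = [("?x", "knows", "?y"), ("?y", "livesIn", "paris")]

def match(query, graph, binding=None):
    """Enumerate variable bindings that embed the query graph in the graph."""
    binding = binding or {}
    if not query:
        yield dict(binding)
        return
    s, p, o = query[0]
    for triple in graph:
        b, ok = dict(binding), True
        for qterm, gterm in zip((s, p, o), triple):
            if qterm.startswith("?"):          # variable: bind consistently
                if b.setdefault(qterm, gterm) != gterm:
                    ok = False
            elif qterm != gterm:               # constant: must match exactly
                ok = False
        if ok:
            yield from match(query[1:], graph, b)

print(list(match(Q, G)))  # [{'?x': 'alice', '?y': 'bob'}]
```

Each solution is one subgraph match of Q in G; in the distributed setting each fragment would produce partial versions of such bindings for later assembly.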
Lei Zou received his B.S. and Ph.D. degrees in Computer Science from Huazhong University of Science and Technology (HUST) in 2003 and 2009, respectively. He received a CCF (China Computer Federation) Doctoral Dissertation Nomination Award in 2009 and won the Second Class Prize of the CCF Natural Science Award in 2014. In September 2009, he joined the Institute of Computer Science and Technology (ICST) of Peking University (PKU) as a faculty member, and he has been an associate professor at PKU since August 2012. Before joining PKU, he visited the Hong Kong University of Science and Technology (HKUST) and the University of Waterloo (UW) in 2007 and 2008, respectively. His recent research interests include graph databases and RDF knowledge graphs, particularly graph-based RDF data management. He has published more than 30 papers, including more than 15 in reputed journals and major international conferences such as SIGMOD, VLDB, ICDE, TKDE, and the VLDB Journal. His personal homepage is at http://www.icst.pku.edu.cn/intro/leizou/index.html.
The acquisition of knowledge has become scalable, and machine-readable knowledge keeps pace with the phenomenal “Big Data” era. On the one hand, we have a revolutionary way of piling up knowledge; on the other hand, the technology for making knowledge graphs accessible, i.e., serving the knowledge to support real-life applications, evolves slowly. Due to its great connectedness, knowledge data is by its very nature a complex entity graph with rich schemata. This talk presents our efforts in serving real-world knowledge graphs for real-time query processing at scale.
Bin Shao is a lead researcher at Microsoft Research (Beijing, China). He joined Microsoft after receiving his Ph.D. from Fudan University in July 2010. He is the architect and a core developer of Microsoft Graph Engine, a distributed in-memory engine for processing large graphs. His research interests include in-memory databases, distributed systems, graph query processing, and concurrency control algorithms.
Integrating, representing, and reasoning over human knowledge is a computational grand challenge for the 21st century. Currently, most IR approaches are keyword-based statistical approaches; when the input is sparse, noisy, and ambiguous, knowledge is needed to fill the gap. In this talk, I will focus on knowledge-powered short text understanding. I will introduce the Probase project at Microsoft Research, whose goal is to enable machines to understand human communication. Probase is a universal, probabilistic semantic network containing millions of concepts, harnessed automatically from a corpus of billions of web pages. It enables probabilistic interpretations of search queries, document titles, ad keywords, etc., and its probabilistic nature also allows it to incorporate heterogeneous information naturally. I will introduce conceptualization, the core technique we have developed on this probabilistic semantic network, whose goal is to infer the concepts behind a piece of text. I will show how we leverage conceptualization to improve web search, ads matching, query recommendation, and more.
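The flavor of conceptualization, inferring the concept shared by the terms in a short text from a probabilistic isA network, can be sketched as follows. The terms, concepts, and probabilities below are invented toy numbers and are not Probase data or its actual scoring function.

```python
# Hypothetical isA network: term -> {concept: P(concept | term)}.
ISA = {
    "python": {"language": 0.6, "snake": 0.4},
    "java":   {"language": 0.7, "island": 0.3},
}

def conceptualize(terms):
    """Score each concept by the product of its probabilities across terms,
    so a concept shared by all terms dominates; smooth missing concepts."""
    scores = {}
    for concept in set().union(*(ISA[t] for t in terms)):
        p = 1.0
        for t in terms:
            p *= ISA[t].get(concept, 1e-6)
        scores[concept] = p
    return max(scores, key=scores.get)

print(conceptualize(["python", "java"]))  # -> language
```

Seeing "python" alone leaves the snake reading plausible, but "python" and "java" together make "language" the dominant concept, which is exactly the disambiguation effect that short text understanding needs.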
Dr. Zhongyuan Wang is a researcher at Microsoft Research Asia (MSRA), where he leads two projects: Enterprise Dictionary (knowledge mining from enterprises) and Probase (knowledge mining from the Web). He received his bachelor's and master's degrees in computer science from Renmin University in 2007 and 2010, respectively. During his studies he won the Wu Yuzhang Scholarship (the top scholarship at Renmin University), the Kwang-Hua Scholarship, and an ACM SIGMOD'07 Undergraduate Scholarship (one of seven winners worldwide). After graduating from RUC, he joined MSRA as a research software development engineer and later became an associate researcher. To date, he has published 10+ papers (including the ICDE 2015 Best Paper) at leading international conferences such as VLDB, ICDE, and CIKM. He is also the translator of the book “Windows Phone 7 Programming for Android and iOS Developers” (2012) and co-author of the book “Web Data Management: Concepts and Techniques” (2014). He has supervised 30+ interns, who have gone on to Ph.D. offers from Harvard, Yale, CMU, UW, and elsewhere. His research interests include knowledge bases, Web data mining, semantic networks, machine learning, and natural language processing.
Large-scale knowledge graphs (KGs) contain rich entities and abundant relationships among them. Data exploration over KGs allows users to browse the attributes of entities as well as the relations among entities, and therefore provides a good way of learning the structure and coverage of a KG. In this talk, we introduce a system called SEED that supports entity-oriented exploration in large-scale KGs by retrieving entities similar to some seed entities, together with the semantic relations that show how the entities are similar to each other. A by-product of entity exploration in SEED is that it helps discover deficiencies of the KG, so that detected bugs can easily be fixed by users as they explore it.
Yueguo Chen, Ph.D., is an Associate Professor and doctoral advisor at Renmin University of China, a corresponding member of the CCF Task Force on Big Data, and a member of CCF YOCSEF. He received his Ph.D. from the National University of Singapore in 2009. His current research focuses on real-time big data analytics systems and exploratory search over knowledge graphs. He has published more than 20 papers in journals and conferences such as TKDE, ICDE, AAAI, EDBT, and CIKM. He has led a Young Scientists project and a General project of the National Natural Science Foundation of China, the Guangdong Province major science and technology project “Industrialization of a High-Throughput Real-Time Business Intelligence System for Big Data”, and a Renmin University team pre-research project “Key Techniques of Big Data Management for Socialized Services”. He organized the WISE 2013 entity-annotation challenge, the Sogou–NDBC knowledge-extraction challenge, and the 2015 China Youth Big Data Innovation Competition. He reviews for venues such as VLDBJ, TKDE, ICDE, WWW, and CIKM, and is a young associate editor of Frontiers of Computer Science.
A knowledge graph describes the relationships among a vast number of real-world entities. Because so many entities and relations are involved, automatically extracting them from the Web is the primary way to construct a knowledge graph. However, owing to the limitations of natural language processing and artificial intelligence techniques, automatically mined relations are often of low confidence, which makes queries over the knowledge graph uncertain and of low quality and seriously harms its usability. Appropriate human intervention can remove this uncertainty and compensate for the shortcomings of automatic methods. For cost-effectiveness, we hand the uncertain relations to crowdsourcing workers for cleaning, which raises a number of problems, such as selecting which relations to clean, judging crowdsourced results, and designing dynamic incentive mechanisms.
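One standard building block for judging crowdsourced results is weighted majority voting, where each worker's vote on "is this triple correct?" is weighted by the worker's estimated accuracy. The sketch below is a generic illustration of that idea with invented workers and accuracies; it is not the talk's actual result-judging method.

```python
import math

# Hypothetical per-worker accuracy estimates.
WORKER_ACC = {"w1": 0.9, "w2": 0.6, "w3": 0.7}

def aggregate(votes):
    """votes: {worker: True/False}. Weight each vote by the log-odds of
    the worker's accuracy; a positive total means 'triple is correct'."""
    score = sum((1 if v else -1) * math.log(WORKER_ACC[w] / (1 - WORKER_ACC[w]))
                for w, v in votes.items())
    return score > 0

# One highly accurate worker can outvote two mediocre ones.
print(aggregate({"w1": True, "w2": False, "w3": False}))  # -> True
```

Under this weighting, the single 90%-accurate worker overrides the two weaker dissenters, which is exactly why accuracy estimation matters when cleaning uncertain relations with a crowd.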
Ph.D., born in July 1981, he is an Associate Professor in the School of Information Science and Technology at East China Normal University. His current research focuses on new forms of data management, an area in which he has published more than 30 papers, including four papers in the past three years in TKDE (a CCF Class-A journal) and at ICDE (a CCF Class-A conference). In 2011 he was selected for the first cohort of the “Hong Kong Scholars Program” and spent two years as a visiting researcher at Hong Kong Baptist University. After returning in 2014, he was selected for the Shanghai “Pujiang Talent Program”. He is a young associate editor of the SCI journal Frontiers of Computer Science, a reviewer for authoritative journals such as TKDE and TPDS, and has repeatedly served as a PC member of international conferences such as WAIM and ICPADS.
Since Google introduced the concept of the knowledge graph in 2012 and successfully applied it to Web search, knowledge graphs and semantic technologies have been receiving increasing attention from both academia and industry. In particular, how to fuse heterogeneous knowledge from different sources into a more complete knowledge base has become a hot research topic. In this talk, I will share the fusion challenges I have faced in building knowledge graphs (especially Chinese knowledge graphs) and the corresponding solutions. Specifically, I will introduce several fusion algorithms, including the instance matching used in Zhishi.me (the first Chinese open data set), efficient online knowledge graph aggregation, and cross-lingual schema mapping.
Haofen Wang received his Ph.D. in engineering from Shanghai Jiao Tong University in 2013 and is currently a lecturer at East China University of Science and Technology, where he also serves as assistant director of the Institute of Computer Technology and deputy director of the Laboratory of Natural Language Processing and Big Data Mining. He has extensive experience in semantic technologies and graph data management, with more than 40 high-quality publications, including more than 20 CCF Class-A and Class-B papers. As technical lead, he guided the team whose semantic search system won second place worldwide in the Billion Triple Challenge and first place worldwide in the instance matching track of OAEI, the well-known ontology alignment competition. His team built Zhishi.me, the first Chinese linked open knowledge base, and he was invited to present it at the W3C multilingual workshop. He organized three semantic search workshops (the WWW workshops SemSearch09, SemSearch10, and SemSearch11) as well as ISWC 2010, the top international Semantic Web conference, and has long served on the program committees of top conferences such as ISWC, WWW, and AAAI. His team also won first place in the Baidu big data knowledge mining competition for two consecutive years. He has led and participated in several projects funded by the National Natural Science Foundation of China, the national 863 Program, and the National Key Technology R&D Program. During his Ph.D. studies he received the IBM Ph.D. Fellowship for two consecutive years and was deeply involved in the development of the IBM Watson system. He is currently a member of the academic committee of CCF YOCSEF Shanghai, a member of the CIPS Technical Committee on Language and Knowledge Computing, the knowledge graph track chair of NLPCC 2015, and a knowledge graph lecturer at CCF ADL 55.
Knowledge representation is one of the core techniques of knowledge graphs, and probabilistic knowledge representation is an important direction in the field. We extend stochastic grammars, which originated in natural language processing, into a general probabilistic knowledge representation for modeling many types of data, such as the spatial hierarchical composition of objects and scenes, the temporal hierarchical composition of events, and probability distributions over vector data. We will also briefly discuss how to construct stochastic grammars from unannotated data (e.g., text and images) with unsupervised learning techniques.
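A stochastic grammar assigns probabilities to production rules and can generate hierarchical structures by sampling. The toy grammar below, with made-up "scene/room" rules, only illustrates this generative mechanism; it is not the representation developed in the talk.

```python
import random

# Toy stochastic grammar: nonterminal -> [(right-hand side, probability)].
RULES = {
    "SCENE": [(["ROOM"], 0.5), (["ROOM", "SCENE"], 0.5)],
    "ROOM":  [(["chair"], 0.6), (["table"], 0.4)],
}

def sample(symbol, rng):
    """Expand a nonterminal by sampling one of its weighted productions;
    symbols with no rules are terminals."""
    if symbol not in RULES:
        return [symbol]
    options, weights = zip(*RULES[symbol])
    rhs = rng.choices(options, weights=weights)[0]
    out = []
    for s in rhs:
        out.extend(sample(s, rng))
    return out

print(sample("SCENE", random.Random(0)))
```

Each sampled derivation is a hierarchical composition (a scene made of rooms made of objects), and the product of the rule probabilities along the derivation gives its likelihood, which is what unsupervised grammar learning optimizes.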
Kewei Tu received his master's degree from the Department of Computer Science and Engineering at Shanghai Jiao Tong University in 2005 and his Ph.D. in computer science from Iowa State University in 2012. From 2012 to 2014, he was a postdoctoral researcher in the Department of Statistics at the University of California, Los Angeles. Since 2014, he has been an assistant professor and doctoral advisor in the School of Information Science and Technology at ShanghaiTech University.
Yanghua Xiao received his Ph.D. in software theory from Fudan University, Shanghai, China, in 2009. He is now an associate professor of computer science at Fudan University and one of the young 973 scientists. His research interests include big data management and mining, graph databases, and knowledge graphs. He was a visiting professor of the Human Genome Sequencing Center at Baylor College of Medicine and a visiting researcher at Microsoft Research Asia. He received a CCF (China Computer Federation) Best Ph.D. Thesis Nomination in 2010, the CCF 2014 Natural Science Award (second class), and an ACM (CCF) Shanghai distinguished young scientist nomination award. He has published more than 50 papers in leading international journals and top conferences, including Pattern Recognition, Physical Review E, PLOS ONE, Information Systems, Computers & Mathematics with Applications, BMC Systems Biology, and Physica A, as well as SIGMOD, VLDB, ICDE, IJCAI, AAAI, EDBT, ICSE, OOPSLA, WWW, ICDM, SDM, ECML/PKDD, ICWS, ICSM, CIKM, DASFAA, COMPSAC, and SSDBM. He is the PI or co-PI of more than ten projects supported by the Natural Science Foundation of China, the Ministry of Education of China, the Shanghai Municipal Science and Technology Commission, Microsoft, IBM, China Telecom, Baidu, and others. He regularly serves as a reviewer for the Natural Science Foundation of China; as a PC member of IJCAI, SIGKDD, ICDE, WWW, CIKM, ICDM, COLING, WAIM, GDM, etc.; as an associate editor of Frontiers of Computer Science; and as a reviewer for leading journals such as PLOS ONE, IEEE Transactions on Computers, TKDE, KAIS, World Wide Web Journal, JCST, Physica A, IEEE Intelligent Systems, BMC Bioinformatics, and Distributed and Parallel Databases. He is a member of ACM and IEEE and a senior member of CCF. He is the director of GDM@FUDAN.
Contact: 18018508381
Professor and vice dean of the School of Computer Science at Fudan University. He has long been engaged in the research and development of technologies for managing and mining massive data, with a research focus on big data management and analytics. He has published more than 100 papers in authoritative domestic and international journals and conferences, cited more than 500 times. As principal investigator or key member, he has undertaken more than 40 projects, including sub-projects of national major science and technology programs, National Natural Science Foundation of China projects (both General and Key), the national 973 and 863 programs, and defense pre-research projects.
From Hongqiao Airport to the venue: take Metro Line 2 to Zhangjiang station, then walk or take Pudong bus 22 or 25 or the tram.
From Pudong Airport to the venue: take Metro Line 2 to Zhangjiang station, then walk or take Pudong bus 22 or 25 or the tram.
From Shanghai Railway Station to the venue: take Metro Line 1 to People's Square station, transfer to Metro Line 2, get off at Zhangjiang station, then walk or take Pudong bus 22 or 25 or the tram.
From Hongqiao Railway Station to the venue: take Metro Line 2 to Zhangjiang station, then walk or take Pudong bus 22 or 25 or the tram.
Please click this link to register and fill in your personal information. Thank you for your interest. For any questions, please contact liangbin@fudan.edu.cn (13524401312).
The workshop is free of charge; attendees cover their own accommodation and travel expenses. A banquet will be held, open only to members of the technical committee. Because venue capacity is limited, if registrations exceed the limit, registration requests from non-committee members beyond the limit will be reviewed by the committee.
National Natural Science Foundation of China
Young Scientists 973 Program, Ministry of Science and Technology
Shanghai Science and Technology Commission Basic Research Key Fund