Frontiers of Data and Computing ›› 2021, Vol. 3 ›› Issue (6): 81-97.

doi: 10.11871/jfdc.10-1649.2021.06.006

Generating a Hematopoietic Stem Cell Knowledge Graph for Scientific Knowledge Discovery

HU Zhengyin1,2,*, LIU Leilei2, CHEN Wenjie1, LIU Chunjiang1, QIAN Li2,3, SONG Yibing4

    1. Chengdu Library and Information Centre, Chinese Academy of Sciences, Chengdu, Sichuan 610041, China
    2. Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
    3. National Science Library, Chinese Academy of Sciences, Beijing 100190, China
    4. Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, Guangdong 510530, China
  • Received: 2021-11-10 Online: 2021-12-20 Published: 2022-01-26
  • Contact: HU Zhengyin E-mail: huzy@clas.ac.cn; liuleilei@mail.las.ac.cn; chenwj@clas.ac.cn; liucj@clas.ac.cn; qianl@mail.las.ac.cn

Abstract:

[Objective] Hematopoietic stem cells (HSCs) are among the most effective stem cells for clinical treatment. Mining the literature to discover important knowledge entities, knowledge relations, and knowledge paths is of great significance for HSC knowledge discovery. The knowledge graph (KG), which represents knowledge entities and their relations in rich detail yet in a simple manner, is widely used in scientific knowledge discovery (SKD). [Methods] This paper proposes a framework for generating a KG from Subject-Predicate-Object (SPO) triples extracted from the literature. The framework comprises six processes: literature retrieval, SPO extraction, SPO cleanup, SPO ranking, discovery pattern integration, and graph building. An HSC KG was then constructed on the Neo4j graph database following this framework. Finally, three kinds of SKD scenarios using the HSC KG are presented through empirical analysis. [Results] The results show that the HSC KG has four advantages: it uses a graph data structure, integrates discovery patterns, incorporates native graph mining algorithms, and is easy to use. It can effectively support deep open discovery, closed discovery, and topic discovery in HSC research.
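
The following is a minimal Python sketch of how SPO triples might flow through the cleanup, ranking, and graph-building steps named in the abstract, followed by a simple open-discovery (A to B to C) query. It is an illustration under stated assumptions, not the authors' implementation: the triples are invented placeholders, frequency counting stands in for whatever ranking method the paper uses, and a plain in-memory dictionary stands in for the Neo4j graph database. All function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical SPO triples (subject, predicate, object); NOT data from the paper.
raw_triples = [
    ("Drug A", "STIMULATES", "Gene B"),
    ("drug a", "STIMULATES", "gene b"),   # duplicate up to letter case
    ("Gene B", "AFFECTS", "Disease C"),
    ("Gene B", "AFFECTS", "Disease C"),
    ("Drug A", "TREATS", "Disease D"),
    ("Gene B", "", "Disease E"),          # incomplete triple, dropped in cleanup
]


def clean(triples):
    """SPO cleanup (assumed): normalize case and whitespace, drop incomplete triples."""
    cleaned = []
    for s, p, o in triples:
        s, p, o = s.strip().lower(), p.strip().upper(), o.strip().lower()
        if s and p and o:
            cleaned.append((s, p, o))
    return cleaned


def rank(triples):
    """SPO ranking (assumed): order distinct triples by how often they occur."""
    return Counter(triples).most_common()


def build_graph(ranked):
    """Graph building: adjacency list mapping subject -> [(predicate, object, weight)]."""
    graph = defaultdict(list)
    for (s, p, o), weight in ranked:
        graph[s].append((p, o, weight))
    return graph


def open_discovery(graph, start):
    """Find two-step paths A -> B -> C where C is not directly linked to A."""
    direct = {o for _, o, _ in graph.get(start, [])}
    paths = []
    for p1, b, _ in graph.get(start, []):
        for p2, c, _ in graph.get(b, []):
            if c != start and c not in direct:
                paths.append((start, p1, b, p2, c))
    return paths


ranked = rank(clean(raw_triples))
graph = build_graph(ranked)
paths = open_discovery(graph, "drug a")
for path in paths:
    print(" -> ".join(path))
```

In a graph database such as Neo4j, the same two-step pattern would typically be expressed as a path query rather than nested loops; the dictionary version above only shows the shape of the data and the discovery pattern.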

Key words: knowledge graph, SPO triple, scientific knowledge discovery, literature mining, hematopoietic stem cell