Frontiers of Data and Computing ›› 2021, Vol. 3 ›› Issue (3): 126-135. doi: 10.11871/jfdc.issn.2096-742X.2021.03.011

• Technology and Application •

Review of Genomic Microsatellite Status Detection Based on Machine Learning

ZHANG Shuying1,2, HAN Xinyin1,2, HE Xiaoyu1,2, YUAN Danyang1,2, LUAN Haijing1,2, LI Ruilin1, HE Jiayin1, NIU Beifang1,2,*

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-01-21 Online: 2021-06-20 Published: 2021-07-09
  • Contact: NIU Beifang E-mail: zhangshuying@cnic.cn; hanxinyin@cnic.cn; hexy@sccas.cn; yuandanyang@cnic.cn; luanhaijing@cnic.cn; lirl@sccas.cn; jiayin.he@cnic.cn; niubf@cnic.cn

Abstract:

[Objective] This paper discusses the application of machine learning to microsatellite status detection and its future research directions. [Scope of the literature] We collected the literature on microsatellite status detection methods. [Methods] First, the significance of microsatellite status detection and the common detection methods are briefly introduced. Second, the current mainstream detection methods based on machine learning are described in detail. Finally, prospective research directions for machine learning in microsatellite status detection are presented. [Results] Detection methods based on machine learning can learn iteratively from massive sequencing data, identify the key features that affect microsatellite instability, and achieve accurate predictions. [Limitations] Because the detection methods use different data types, they cannot be compared on a common dataset. [Conclusions] Machine learning has been widely used in microsatellite status detection. Improving the applicability of detection methods and detecting microsatellite status from peripheral blood samples are future research directions for machine learning in this field.

Key words: machine learning, genome, microsatellite instability, sequencing data, key features