Frontiers of Data and Computing, 2025, Vol. 7, Issue (1): 186-202.

CSTR: 32002.14.jfdc.CN10-1649/TP.2025.01.014

doi: 10.11871/jfdc.issn.2096-742X.2025.01.014

• Technology and Application •

Influence of Incomplete Landslide Data on Susceptibility Modeling and Suggestions for Improvement

HE Yifei1,2,3, ZHANG Yaonan1,3,4,*

    1. Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou, Gansu 730000, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. National Cryosphere Desert Data Center, Lanzhou, Gansu 730000, China
    4. Gansu Data Engineering and Technology Research Center for Resources and Environment, Lanzhou, Gansu 730000, China
  • Received: 2023-03-27 Online: 2025-02-20 Published: 2025-02-21

Abstract:

[Objective] China is among the countries most frequently affected by landslide disasters. A reliable nationwide landslide susceptibility model is needed to identify areas of high landslide hazard, formulate appropriate disaster prevention and mitigation strategies, and reduce losses of life and property. [Methods] Because completely unbiased landslide data are difficult to obtain over an area as large as China, this study selected 10 influencing factors (slope, aspect, profile curvature, plan curvature, road density, river density, soil moisture, lithology, land use, and geological environment division) as driving data and designed three model schemes: Model Scheme 1 (based on LightGBM, ignoring the effects of incomplete landslide data), Model Scheme 2 (based on LightGBM, excluding the factors associated with landslide incompleteness), and Model Scheme 3 (based on TBMM, including the variables that describe landslide incompleteness, i.e., land use and geological environment division, as random effect terms). Landslide susceptibility was assessed separately under each scheme to explore the impact of incomplete landslide data on landslide susceptibility modeling in China and the measures that can counteract this bias. [Results] The results show that, in the context of large-region susceptibility modeling, (1) although the model schemes that simply ignore or exclude the factors associated with deficiencies in the existing landslide data achieve higher statistical performance, they lead to geomorphically incoherent landslide susceptibility predictions; (2) the mixed effects model can effectively reduce the bias caused by incomplete landslide data.
[Conclusions] This study offers a new approach to landslide susceptibility mapping when landslide data are incomplete. It contributes to assessing China’s overall mass movement susceptibility and supports policymakers in master planning for risk mitigation.

Key words: landslide, susceptibility map, inventory bias, LightGBM, TBMM
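
The three model schemes described in the abstract differ mainly in how the two incompleteness-related variables are treated. The sketch below is a minimal illustration of that design, not the authors' code: the identifier names and the `scheme_design` helper are hypothetical, and it assumes that Scheme 2 drops exactly land use and geological environment division and that Scheme 3 keeps the remaining eight factors as ordinary predictors.

```python
# Illustrative sketch only; all names here are hypothetical labels.
# The ten influencing factors listed in the abstract.
FACTORS = [
    "slope", "aspect", "profile_curvature", "plan_curvature",
    "road_density", "river_density", "soil_moisture", "lithology",
    "land_use", "geological_environment_division",
]

# Variables the abstract names as describing landslide incompleteness.
BIAS_FACTORS = ["land_use", "geological_environment_division"]


def scheme_design(scheme: int) -> dict:
    """Return the model family, predictors, and random-effect terms for a scheme."""
    # Assumption: the factors not tied to incompleteness act as regular predictors.
    other = [f for f in FACTORS if f not in BIAS_FACTORS]
    if scheme == 1:
        # Scheme 1: LightGBM on all factors, incompleteness ignored.
        return {"model": "LightGBM", "predictors": list(FACTORS), "random_effects": []}
    if scheme == 2:
        # Scheme 2: LightGBM with the incompleteness-related factors excluded.
        return {"model": "LightGBM", "predictors": other, "random_effects": []}
    if scheme == 3:
        # Scheme 3: TBMM with the incompleteness-related factors as random effects.
        return {"model": "TBMM", "predictors": other, "random_effects": list(BIAS_FACTORS)}
    raise ValueError(f"unknown scheme: {scheme}")


for s in (1, 2, 3):
    d = scheme_design(s)
    print(s, d["model"], len(d["predictors"]), d["random_effects"])
```

Laying the schemes out this way makes the comparison explicit: Schemes 2 and 3 share the same predictors, so any difference between them reflects how the incompleteness-related variables are handled rather than which terrain factors are used.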