Selection of contributing factors for predicting landslide susceptibility using machine learning and deep learning models

Cheng Chen; Lei Fan

doi:10.1007/s00477-023-02556-4

Cheng Chen, Lei Fan^*

^*Corresponding author for this work

Department of Civil Engineering

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract

Landslides are a common natural hazard that can cause casualties, property damage and economic losses. It is therefore important to understand or predict the probability of landslide occurrence at potentially risky sites. A common approach is to carry out a landslide susceptibility assessment based on a landslide inventory and a set of landslide contributing factors. This can be readily achieved using machine learning (ML) models such as logistic regression (LR), support vector machine (SVM), random forest (RF) and extreme gradient boosting (XGBoost), or deep learning (DL) models such as convolutional neural network (CNN) and long short-term memory (LSTM). As the input data for these models, landslide contributing factors have varying influences on landslide occurrence. It is therefore reasonable to select the more important contributing factors and eliminate the less relevant ones, with the aim of increasing the prediction accuracy of these models. However, selecting the more important factors remains a challenging task, and there is no generally accepted method. Furthermore, the effects of factor selection using various methods on the prediction accuracy of ML and DL models are unclear. In this study, the impact of the selection of contributing factors on the accuracy of landslide susceptibility predictions using ML and DL models was investigated. The factor selection methods considered for all the aforementioned ML and DL models were Information Gain Ratio (IGR), Recursive Feature Elimination (RFE), Particle Swarm Optimization (PSO), Least Absolute Shrinkage and Selection Operator (LASSO) and Harris Hawk Optimization (HHO). In addition, autoencoder-based factor selection methods for DL models were investigated. To assess the performance of these methods, an exhaustive approach was adopted that tested all possible selections of contributing factors, and its results served as the benchmark.
The results confirmed that using the more important contributing factors improved the prediction accuracy of the ML models considered. Notably, however, selecting contributing factors using IGR and RFE reduced the prediction accuracy of the DL models, whereas using an autoencoder architecture improved their prediction performance. The study also found that the choice of factor selection methods was far more effective than the selection of contributing factors in improving the prediction accuracy of landslide susceptibility.
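
To make two of the ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation (which this record does not describe). It ranks discretized contributing factors by information gain ratio and enumerates every non-empty factor subset, which is the search space an exhaustive benchmark would have to cover. The factor names, the toy data and the helper functions `entropy`, `gain_ratio` and `all_factor_subsets` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(factor_values, labels):
    """Information gain of a discrete factor divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(factor_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting the samples by factor class.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes factors with many small classes.
    split_info = entropy(factor_values)
    return info_gain / split_info if split_info > 0 else 0.0


def all_factor_subsets(names):
    """Yield every non-empty subset of factor names (2**n - 1 in total)."""
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)


# Hypothetical toy data: each factor is already discretized into classes.
factors = {
    "slope_class": ["steep", "steep", "gentle", "gentle", "steep", "gentle"],
    "lithology": ["a", "b", "a", "b", "a", "b"],
    "land_use": ["forest", "forest", "forest", "urban", "urban", "urban"],
}
landslide = [1, 1, 0, 0, 1, 0]  # 1 = landslide cell, 0 = non-landslide cell

scores = {name: gain_ratio(values, landslide) for name, values in factors.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
subsets = list(all_factor_subsets(list(factors)))

print(ranking[0], len(subsets))
```

In this toy data the slope class separates landslide and non-landslide cells perfectly, so it receives the highest gain ratio. A real exhaustive benchmark would train and score a model for each enumerated subset, and because the number of subsets grows as 2^n - 1, that is only practical for a modest number of factors.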

Original language	English
Journal	Stochastic Environmental Research and Risk Assessment
DOIs	https://doi.org/10.1007/s00477-023-02556-4
Publication status	Published - 13 Sept 2023

Keywords

Contributing factors
Deep learning
Factor selection
Landslide
Landslide susceptibility
Machine learning
Prediction

Access to Document

10.1007/s00477-023-02556-4

Cite this

@article{ffeafaf67a7a45e8b6f732eb41655967,

title = "Selection of contributing factors for predicting landslide susceptibility using machine learning and deep learning models",

abstract = "Landslides are a common natural hazard that can cause casualties, property damage and economic losses. It is therefore important to understand or predict the probability of landslide occurrence at potentially risky sites. A common approach is to carry out a landslide susceptibility assessment based on a landslide inventory and a set of landslide contributing factors. This can be readily achieved using machine learning (ML) models such as logistic regression (LR), support vector machine (SVM), random forest (RF) and extreme gradient boosting (XGBoost), or deep learning (DL) models such as convolutional neural network (CNN) and long short-term memory (LSTM). As the input data for these models, landslide contributing factors have varying influences on landslide occurrence. It is therefore reasonable to select the more important contributing factors and eliminate the less relevant ones, with the aim of increasing the prediction accuracy of these models. However, selecting the more important factors remains a challenging task, and there is no generally accepted method. Furthermore, the effects of factor selection using various methods on the prediction accuracy of ML and DL models are unclear. In this study, the impact of the selection of contributing factors on the accuracy of landslide susceptibility predictions using ML and DL models was investigated. The factor selection methods considered for all the aforementioned ML and DL models were Information Gain Ratio (IGR), Recursive Feature Elimination (RFE), Particle Swarm Optimization (PSO), Least Absolute Shrinkage and Selection Operator (LASSO) and Harris Hawk Optimization (HHO). In addition, autoencoder-based factor selection methods for DL models were investigated. To assess the performance of these methods, an exhaustive approach was adopted that tested all possible selections of contributing factors, and its results served as the benchmark.
The results confirmed that using the more important contributing factors improved the prediction accuracy of the ML models considered. Notably, however, selecting contributing factors using IGR and RFE reduced the prediction accuracy of the DL models, whereas using an autoencoder architecture improved their prediction performance. The study also found that the choice of factor selection methods was far more effective than the selection of contributing factors in improving the prediction accuracy of landslide susceptibility.",

keywords = "Contributing factors, Deep learning, Factor selection, Landslide, Landslide susceptibility, Machine learning, Prediction",

author = "Cheng Chen and Lei Fan",

note = "Funding Information: This research was funded by Xi{\textquoteright}an Jiaotong-Liverpool University key Program Special Fund under grant number KSF-E-40 and Research Enhancement Fund under grant number REF-21-01-003. Publisher Copyright: {\textcopyright} 2023, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.",

year = "2023",

month = sep,

day = "13",

doi = "10.1007/s00477-023-02556-4",

language = "English",

journal = "Stochastic Environmental Research and Risk Assessment",

issn = "1436-3240",

}

TY - JOUR

T1 - Selection of contributing factors for predicting landslide susceptibility using machine learning and deep learning models

AU - Chen, Cheng

AU - Fan, Lei

N1 - Funding Information: This research was funded by Xi’an Jiaotong-Liverpool University key Program Special Fund under grant number KSF-E-40 and Research Enhancement Fund under grant number REF-21-01-003. Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.

PY - 2023/9/13

Y1 - 2023/9/13

AB - Landslides are a common natural hazard that can cause casualties, property damage and economic losses. It is therefore important to understand or predict the probability of landslide occurrence at potentially risky sites. A common approach is to carry out a landslide susceptibility assessment based on a landslide inventory and a set of landslide contributing factors. This can be readily achieved using machine learning (ML) models such as logistic regression (LR), support vector machine (SVM), random forest (RF) and extreme gradient boosting (XGBoost), or deep learning (DL) models such as convolutional neural network (CNN) and long short-term memory (LSTM). As the input data for these models, landslide contributing factors have varying influences on landslide occurrence. It is therefore reasonable to select the more important contributing factors and eliminate the less relevant ones, with the aim of increasing the prediction accuracy of these models. However, selecting the more important factors remains a challenging task, and there is no generally accepted method. Furthermore, the effects of factor selection using various methods on the prediction accuracy of ML and DL models are unclear. In this study, the impact of the selection of contributing factors on the accuracy of landslide susceptibility predictions using ML and DL models was investigated. The factor selection methods considered for all the aforementioned ML and DL models were Information Gain Ratio (IGR), Recursive Feature Elimination (RFE), Particle Swarm Optimization (PSO), Least Absolute Shrinkage and Selection Operator (LASSO) and Harris Hawk Optimization (HHO). In addition, autoencoder-based factor selection methods for DL models were investigated. To assess the performance of these methods, an exhaustive approach was adopted that tested all possible selections of contributing factors, and its results served as the benchmark.
The results confirmed that using the more important contributing factors improved the prediction accuracy of the ML models considered. Notably, however, selecting contributing factors using IGR and RFE reduced the prediction accuracy of the DL models, whereas using an autoencoder architecture improved their prediction performance. The study also found that the choice of factor selection methods was far more effective than the selection of contributing factors in improving the prediction accuracy of landslide susceptibility.

KW - Contributing factors

KW - Deep learning

KW - Factor selection

KW - Landslide

KW - Landslide susceptibility

KW - Machine learning

KW - Prediction

UR - http://www.scopus.com/inward/record.url?scp=85171194474&partnerID=8YFLogxK

U2 - 10.1007/s00477-023-02556-4

DO - 10.1007/s00477-023-02556-4

M3 - Article

AN - SCOPUS:85171194474

SN - 1436-3240

JO - Stochastic Environmental Research and Risk Assessment

JF - Stochastic Environmental Research and Risk Assessment

ER -
