
    Debris Flow Susceptibility and Its Reliability Based on Random Forest and GIS

    Zhang Shuhao, Wu Guang

    Citation: Zhang Shuhao, Wu Guang, 2019. Debris Flow Susceptibility and Its Reliability Based on Random Forest and GIS. Earth Science, 44(9): 3115-3134. doi: 10.3799/dqkx.2019.081

    doi: 10.3799/dqkx.2019.081
    Funding:

    Science and Technology Development Program of China Railway Corporation (2010G004-I)

    Details
      About the author:

      Zhang Shuhao (1990-), male, PhD candidate, mainly engaged in hazard assessment and stability analysis of slope geological hazards

      Corresponding author:

      Wu Guang, E-mail: 962444613@qq.com

    • CLC number: P642.23

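    • Abstract: Current GIS-based models for assessing debris flow susceptibility (DFS) have known limitations. Statistical models require their factors to be independent, and their weights are controlled by how factor values are divided into intervals. Linear machine learning models struggle with nonlinear problems, while commonly used nonlinear models are inefficient to tune. Random forest (RF) can effectively overcome many of these shortcomings, yet it has rarely been applied to DFS assessment. This study therefore first carries out an RF-based DFS assessment, using Bayesian-optimized models (linear support vector machine, RBF support vector machine, quadratic discriminant analysis and RF) and 26 debris-flow conditioning factors. It then studies the reliability of DFS assessment under changes in factor combination, using the RF relative-weight ranking, and under changes in modeling samples, using the Monte Carlo method. The results show that: 21 factors can indicate differences in the debris-flow-forming environment between the low and relatively high susceptibility zones of the RF model; the RF relative-weight ranking can effectively determine the locally optimal factor combination for a susceptibility model; the assessment uncertainty caused by random sample partitioning is largest in the moderate susceptibility zone and should be reduced by increasing the proportion of modeling samples and improving the model; RF achieves an AUC of 0.86, an overall prediction accuracy of 0.79, an F1 score of 0.66 and a Brier score of 0.14, and both these indices and their reliability are the best among the models compared, so RF can be a preferred choice for quantitative DFS assessment.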
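The abstract reports AUC and the Brier score as its probabilistic performance indices. As a reminder of what the two quantities measure, here is a minimal pure-Python sketch. The function names and the four toy samples are illustrative assumptions and are not taken from the paper, whose own implementation is not described on this page.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = debris flow) and predicted probabilities, for illustration only.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print("Brier score:", brier_score(labels, probs))
print("AUC:", auc(labels, probs))
```

A lower Brier score and a higher AUC both indicate better predictions, which is why the abstract lists RF's 0.14 and 0.86 together as evidence of its performance.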

       

    • Fig. 1. The setting of the study area and the sites of debris-flow

      Fig. 2. The match degree between the probability and the density of debris-flow occurrence

      Fig. 3. The distribution of proportions of debris-flows

      Fig. 4. The GIS maps of debris flow susceptibility of four models

      Fig. 5. The difference in the distribution of each conditioning factor in the two susceptibility zones

      Fig. 6. The correlation between the combination of debris-flow conditioning factors and the prediction performance of models

      Panels a and c show scheme 1; panels b and d show scheme 2

      Fig. 7. The GIS maps of the mean debris-flow susceptibility

      Fig. 8. The distribution of indices of prediction performance

      Panel a: LSVM (a=1.43, c=91, s=0.825, μ=[0.82301, 0.82379]), RBF-SVM (a=6.08, c=90.62, s=0.82, μ=[0.83053, 0.83092]), QDA (a=2.41, c=490, s=0.79, μ=[0.81685, 0.81701]), RF (a=8.09, c=131, s=0.85, μ=[0.86010, 0.86036]). Panel b: LSVM (a=216.9, c=25, s=0.16, μ=[0.17017, 0.17029]), RBF-SVM (a=28.1, c=28, s=0.15, μ=[0.15527, 0.15541]), QDA (a=139.7, c=30, s=0.16, μ=[0.17417, 0.17428]), RF (a=18.7, c=38, s=0.14, μ=[0.14063, 0.14074]). The values in parentheses after each model are the parameters of the fitted distribution: a and c are shape parameters, s is the scale parameter, the location parameter is 0 in all cases, and μ is the 95% confidence interval of the mean

      Fig. 9. The distribution of AUC and brier score in different proportions of building samples

      Table 1. The summary of impact factors

      Table 2. Confusion matrices of 4 models

      Model                                  Actual class       Predicted non-debris flow   Predicted debris flow
      LSVM (linear support vector machine)   Non-debris flow    915                         135
                                             Debris flow        202                         318
      RBF-SVM (RBF support vector machine)   Non-debris flow    979                         71
                                             Debris flow        285                         235
      QDA (quadratic discriminant analysis)  Non-debris flow    835                         215
                                             Debris flow        162                         358
      RF (random forest)                     Non-debris flow    931                         119
                                             Debris flow        204                         316

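      Table 3. Classification performance of models

      Model           Overall accuracy (%)   Debris-flow precision (%)   Debris-flow recall (%)   F1 score (%)   AUC (%)
      LSVM            78.54                  70.20                       61.15                    65.36          81.4
      RBF-SVM         77.32                  76.80                       45.19                    56.90          82.8
      QDA             75.99                  62.48                       68.85                    65.51          81.7
      RF              79.43                  72.64                       60.77                    66.18          85.9
      Purely random   50.00                  33.00                       50.00                    39.75          50.0
      Note: overall accuracy = number of correctly classified units / total number of units; debris-flow precision = number of correctly predicted debris-flow units / number of units predicted as debris flow; debris-flow recall = number of correctly predicted debris-flow units / total number of actual debris-flow units; F1 = 2 × debris-flow precision × debris-flow recall / (debris-flow precision + debris-flow recall).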
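The definitions in the note to Table 3 can be checked directly against the counts in Table 2. The sketch below recomputes the first four columns of Table 3 from the confusion matrices. The counts are copied from Table 2, while the function and variable names are illustrative assumptions. AUC is omitted because it cannot be derived from a single confusion matrix.

```python
# Confusion-matrix counts copied from Table 2, in the order
# (true negatives, false positives, false negatives, true positives),
# where "positive" means a debris-flow unit.
TABLE_2 = {
    "LSVM": (915, 135, 202, 318),
    "RBF-SVM": (979, 71, 285, 235),
    "QDA": (835, 215, 162, 358),
    "RF": (931, 119, 204, 316),
}


def classification_metrics(tn, fp, fn, tp):
    """Return overall accuracy, precision, recall and F1 as percentages."""
    accuracy = (tn + tp) / (tn + fp + fn + tp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * v, 2) for v in (accuracy, precision, recall, f1))


for model, counts in TABLE_2.items():
    print(model, classification_metrics(*counts))
```

For the four models the recomputed values agree with Table 3 to two decimal places, for example 79.43, 72.64, 60.77 and 66.18 for RF.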

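      Table 4. Local optimal combination of conditioning factors in each model

      Model     Factor combination                                      AUC increase   Brier score reduction
      LSVM      Factors ranked 1 to 11 by relative weight (largest first)   1.8%           0.7%
      RBF-SVM   Factors ranked 1 to 21 by relative weight (largest first)   1.0%           1.4%
      QDA       Factors ranked 1 to 12 by relative weight (largest first)   0.4%           7.0%
      RF        Factors ranked 1 to 12 by relative weight (largest first)   0.4%           1.7%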
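Each row of Table 4 is a prefix of the relative-weight ranking (the top 11, 12 or 21 factors). One plausible way to find such a prefix, shown here only as a sketch and not as the authors' documented procedure, is to score the top-k factors for every k and keep the best. The factor names and the toy scoring function are hypothetical; in practice the score would come from retraining a model on the chosen factors and measuring, say, its validation AUC.

```python
def best_prefix(ranked_factors, score):
    """Score the top-k factors for k = 1..n and return the best (k, score)."""
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked_factors) + 1):
        current = score(ranked_factors[:k])
        if current > best_score:
            best_k, best_score = k, current
    return best_k, best_score


def toy_score(subset):
    """Hypothetical score that rises up to 3 factors and then declines."""
    k = len(subset)
    return 0.80 + 0.02 * min(k, 3) - 0.01 * max(k - 3, 0)


# Hypothetical factor names, already sorted by decreasing relative weight.
ranked = ["f1", "f2", "f3", "f4", "f5"]
k, value = best_prefix(ranked, toy_score)
print(f"best prefix: top {k} factors, score {value:.2f}")
```

Because only prefixes of one ranking are examined, the result is a local optimum rather than the best of all possible subsets, which matches the "local optimal" wording of the table title.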

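      Table 5. The mean evaluation indices of 2 000 susceptibility assessments

      Model     Overall accuracy (%)   Debris-flow precision (%)   Debris-flow recall (%)   F1 score (%)   AUC (%)   Brier score
      LSVM      78.10                  69.63                       60.20                    64.57          82.3      0.176
      RBF-SVM   76.80                  75.17                       44.77                    56.09          83.1      0.155
      QDA       76.11                  62.65                       69.00                    65.67          81.7      0.174
      RF        79.30                  72.86                       59.76                    65.66          86.0      0.140
      Note: with 2 000 samples, the 95% confidence interval of each index mean is already narrow to the fourth decimal place. The means are therefore highly certain and sufficient for using and comparing the models, so this table reports the means directly rather than the confidence intervals.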
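The note to Table 5 relies on the fact that the confidence interval of a mean shrinks with the square root of the number of runs. The sketch below illustrates this with a normal-approximation 95% interval. The 2 000 values are simulated placeholders, not the paper's results, and the mean and spread chosen for them are arbitrary assumptions.

```python
import math
import random
import statistics


def mean_ci95(values):
    """Normal-approximation 95% confidence interval for the mean."""
    m = statistics.mean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return m - half_width, m + half_width


# Placeholder AUC values standing in for 2 000 Monte Carlo runs (NOT the paper's data).
rng = random.Random(0)
simulated_auc = [rng.gauss(0.86, 0.01) for _ in range(2000)]

low, high = mean_ci95(simulated_auc)
print(f"95% CI of the mean: [{low:.5f}, {high:.5f}], width {high - low:.5f}")
```

Even with a run-to-run standard deviation of about 0.01, the interval is under 0.001 wide, so differences between models in the third decimal place of a mean are distinguishable.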
    Publication history
    • Received: 2019-01-28
    • Published: 2019-09-15
