Landslide Susceptibility Assessment Based on Optimized Random Forest Model

LIU Jian, LI Shulin, CHEN Tao

Citation: LIU Jian, LI Shulin, CHEN Tao. Landslide Susceptibility Assessment Based on Optimized Random Forest Model[J]. Geomatics and Information Science of Wuhan University, 2018, 43(7): 1085-1091. DOI: 10.13203/j.whugis20160515


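Funds:

The National High Technology Research and Development Program of China (863 Program) 2012AA121303

More Information
    Author Bio:

    LIU Jian, PhD candidate, engineer, specializes in cloud computing and geological disaster assessment. E-mail: linefanliu@163.com

    Corresponding author:

    LI Shulin, master's student. E-mail: lishulincug@gmail.com

  • CLC number: P694; P208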


  • Abstract: Taking Shazhenxi Town and Xietan Township in the Three Gorges Reservoir area as the study area, this paper uses entropy-based discretization under the minimal description length principle (Ent-MDLP) to discretize the continuous landslide factors, and computes Pearson correlation coefficients to remove highly correlated factors. To obtain more reliable non-landslide samples, non-landslide points are drawn at random from the very low and low susceptibility zones predicted by the information value method. The random forest is optimized by iteratively computing the out-of-bag (OOB) error estimate to determine a suitable set of random features and their number. The optimized random forest is then used to assess landslide susceptibility in the study area and to divide it into susceptibility levels, and it is compared with logistic regression, support vector machine, and non-optimized random forest models. Receiver operating characteristic (ROC) curves are plotted for each model's predictions; the optimized random forest has the highest area under the curve (AUC), 91.8%, indicating that the optimized random forest model has strong predictive ability in landslide susceptibility assessment.
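The abstract mentions removing highly correlated factors using the Pearson coefficient. The following is a minimal sketch of one way such a filter could work, not the authors' implementation: the threshold of 0.8, the first-come-first-kept order, the helper names `pearson` and `drop_correlated`, and the toy values are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (a constant factor has zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(factors, threshold=0.8):
    """Keep a factor only if |r| with every already-kept factor is <= threshold.

    `factors` maps a factor name to its list of values. The threshold and the
    greedy, order-dependent selection are assumptions of this sketch.
    """
    kept = []
    for name, values in factors.items():
        if all(abs(pearson(values, factors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up toy values, not data from the study area.
factors = {
    "elevation": [120.0, 340.0, 560.0, 410.0, 230.0, 680.0],
    "relief": [118.0, 345.0, 551.0, 420.0, 226.0, 690.0],  # nearly duplicates elevation
    "slope": [12.0, 35.0, 18.0, 41.0, 27.0, 9.0],
}
selected = drop_correlated(factors)
print(selected)
```

Because the selection is greedy, the result depends on the order in which factors are listed; in practice one might instead drop the factor of each correlated pair that is less informative for the model.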
  • Figure 1. Diagram of Random Forest Algorithm

    Figure 2. Location of the Study Area and Distribution of Landslide Disasters

    Figure 3. Information Value Distribution of Factors

    Figure 4. Information Value Distribution of Main Factors

    Figure 5. Importance Distribution of All Factors

    Figure 6. OOB Error Distribution of Random Forest with Different Numbers of Random Features

    Figure 7. ROC Curves of Various Models' Prediction Results

    Figure 8. Distribution of Landslide Susceptibility

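    Table 1. Classification and Characteristics of Experimental Data

    | Data type | Spatial resolution | Purpose | Date |
    | --- | --- | --- | --- |
    | Sentinel-2A | Visible and panchromatic 10 m; multispectral 20 m | Correcting and supplementing existing data such as roads | 2016-02-16 |
    | Landsat 8 | Panchromatic 15 m; multispectral 30 m | Extracting land use, NDVI, NDWI, etc. | 2013-09-15 |
    | DEM | 30 m | Extracting terrain factors such as elevation and slope | |
    | Geological map | 1:50 000 | Extracting stratigraphic lithology, faults, etc. | |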

    Table 2. Analysis of Risk Zoning Results

    | Susceptibility zone | Area A/km² | Landslide area B/km² | Share of total landslide area/% | Risk ratio B/A |
    | --- | --- | --- | --- | --- |
    | Very low | 80.8182 | 0.0144 | 0.13 | 0.0002 |
    | Low | 20.5173 | 0.0504 | 0.47 | 0.0025 |
    | Moderate | 20.3031 | 0.4563 | 4.28 | 0.0225 |
    | High | 19.9638 | 2.0655 | 19.36 | 0.1035 |
    | Very high | 20.5947 | 8.0811 | 75.75 | 0.3924 |
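The last two columns of Table 2 follow arithmetically from the zone area A and the landslide area B. As a small sketch (the variable and function names are illustrative and do not come from the paper), both derived columns can be recomputed from the tabulated areas:

```python
# Areas copied from Table 2: (zone, zone area A in km^2, landslide area B in km^2).
zones = [
    ("very low", 80.8182, 0.0144),
    ("low", 20.5173, 0.0504),
    ("moderate", 20.3031, 0.4563),
    ("high", 19.9638, 2.0655),
    ("very high", 20.5947, 8.0811),
]


def summarize(rows):
    """Return (zone, share of total landslide area in %, ratio B/A) for each zone."""
    total_landslide = sum(b for _, _, b in rows)
    return [
        (name, round(100.0 * b / total_landslide, 2), round(b / a, 4))
        for name, a, b in rows
    ]


summary = summarize(zones)
for name, share, ratio in summary:
    print(f"{name:>9}: share = {share:5.2f} %, B/A = {ratio:.4f}")
```

Recomputing this way reproduces the tabulated shares and ratios, and makes the concentration explicit: the high and very high zones together hold about 95% of the landslide area.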

Publication history
  • Received:  2017-10-18
  • Published:  2018-07-04
