Application of Machine Learning Classification Algorithm to Precipitation-induced Landslides Forecasting
-
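Summary: To meet the practical need in meteorological disaster early warning operations for an objective description of the uncertainty of rainfall-induced landslide occurrence, national landslide data from 2014 to 2020 and multi-source merged precipitation analysis data are used to build a regional probability forecasting model for rainfall-induced landslides based on machine learning classification algorithms. The model is built through the key steps of sample construction, model training, parameter optimization and forecast output, and is used to explore the feasibility of identifying landslide-inducing rainfall processes with different types of machine learning classification algorithms. The results show that, in the algorithm evaluation, the linear discriminant analysis algorithm has the highest accuracy and the best generalization ability, followed by the logistic regression algorithm and then the K-nearest neighbor algorithm. In the forecast experiments, the linear discriminant analysis, logistic regression and K-nearest neighbor algorithms can extract and learn the conditional features of rainfall-induced landslides and show some ability to identify landslide-inducing rainfall processes. The K-nearest neighbor and logistic regression algorithms produce relatively large high-probability areas, which tend to cause false alarms. The linear discriminant analysis algorithm extracts local rainfall information better, but it outputs low-value probability forecasts over too large an area outside the rainfall center.
Abstract: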
To meet the practical need for an objective description of the uncertainty of rainfall-induced landslides, and to address the reliance on single warning indicators and subjective forecasting methods in operational meteorological disaster early warning, landslide disaster data from 2014 to 2020 and multi-source merged precipitation analysis data are used to construct a regional probability forecasting model for rainfall-induced landslides. The model is built with machine learning classification algorithms through four key steps (sample construction, model training, parameter optimization and forecast output) to explore the feasibility of different types of algorithms in identifying landslide-inducing rainfall processes. The training sample set is built around the positive samples, with negative samples obtained by sampling under spatial and temporal constraints. Evaluation of the machine learning classification algorithms on this sample set shows that the linear discriminant analysis algorithm has the highest accuracy (0.863) and the best generalization ability (area under the receiver operating characteristic curve of 0.886) with no over-fitting, followed by the logistic regression algorithm and then the K-nearest neighbor algorithm. In the probabilistic forecasting test on rainfall-induced landslide cases from 2021, all three algorithms can extract and learn the conditional features and show some ability to identify the rainfall processes that induce landslides. The K-nearest neighbor and logistic regression algorithms produce relatively large high-probability areas, which are prone to false alarms. The linear discriminant analysis algorithm produces a more concentrated high-probability area and extracts local rainfall information better, but it outputs low-value probability forecasts over an overly large area outside the rainfall center.
The probability forecasting model for rainfall-induced landslides based on machine learning classification algorithms accounts for the coupling between underlying-surface factors and rainfall factors. It performs better than the commonly used critical threshold model, which assumes that landslide occurrence in the forecast area depends only on rainfall. The application results show that the machine learning classification model compensates for a shortcoming of existing forecasting models, which rarely reflect the influence of the surface environment, and is therefore an important way to improve the performance of landslide forecasting and warning.
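The abstract states only that negative samples are drawn "under spatial-temporal limitation" and gives no rule or parameter values. The following is a minimal sketch of one plausible reading: random (latitude, longitude, time) candidates are rejected when they fall close to a recorded landslide in both space and time. The function names, the 50 km and 72 h limits, the bounding box and the two positive records are all hypothetical and are not taken from the paper.

```python
import math
import random
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def sample_negatives(positives, n, bbox, t_start, t_end,
                     min_km=50.0, min_hours=72.0, seed=0, max_tries=100000):
    """Draw up to n random (lat, lon, time) negatives.

    Assumed rule (not from the paper): a candidate is rejected when it lies
    within min_km AND within min_hours of any positive (landslide) record.
    """
    rng = random.Random(seed)
    lat_min, lat_max, lon_min, lon_max = bbox
    span_hours = int((t_end - t_start).total_seconds() // 3600)
    negatives = []
    tries = 0
    while len(negatives) < n and tries < max_tries:
        tries += 1
        lat = rng.uniform(lat_min, lat_max)
        lon = rng.uniform(lon_min, lon_max)
        t = t_start + timedelta(hours=rng.randrange(span_hours + 1))
        too_close = any(
            haversine_km(lat, lon, plat, plon) < min_km
            and abs((t - pt).total_seconds()) < min_hours * 3600
            for plat, plon, pt in positives
        )
        if not too_close:
            negatives.append((lat, lon, t))
    return negatives


# Hypothetical positive records (lat, lon, time), for illustration only.
positives = [
    (27.0, 119.7, datetime(2016, 6, 10, 4)),
    (31.6, 108.9, datetime(2018, 8, 29, 14)),
]
# Hypothetical sampling domain and period.
bbox = (18.0, 54.0, 73.0, 135.0)
negatives = sample_negatives(
    positives, n=5, bbox=bbox,
    t_start=datetime(2014, 1, 1), t_end=datetime(2020, 12, 31),
)
```

Rejecting only candidates that are close in both space and time keeps rain-free periods at landslide-prone sites available as negatives; a stricter variant could reject on either condition alone.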
-
Key words:
- landslides;
- influence factors;
- machine learning;
- classification algorithms
-
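Table 1 Model algorithms test results

| Model algorithm | ACC | AUC |
|---|---|---|
| Linear discriminant analysis | 0.863 | 0.886 |
| K-nearest neighbor | 0.838 | 0.858 |
| Logistic regression | 0.840 | 0.879 |
| Random forest | 0.834 | 0.849 |
| Support vector machine | 0.821 | 0.819 |
| Decision tree | 0.832 | 0.841 |
| Critical threshold | 0.658 | 0.693 |

Table 2 Cases of rainfall-induced landslides

| No. | Time of occurrence | Location | Threshold model forecast |
|---|---|---|---|
| Case 1 | 2021-05-22T04:00 | Fu'an section of National Highway 104, Ningde City, Fujian Province (27.0°N, 119.7°E) | No landslide |
| Case 2 | 2021-07-26T10:00 | Xiazao Village, Pingshui Town, Keqiao District, Shaoxing City, Zhejiang Province (29.9°N, 120.7°E) | Landslide |
| Case 3 | 2021-08-29T14:00 | Guanmian Community, Guanmian Township, Kaizhou District, Chongqing (31.6°N, 108.9°E) | Landslide |
| Case 4 | 2021-09-05T16:00 | Wufu Village, Kongshan Town, Tongjiang County, Bazhong City, Sichuan Province (32.5°N, 107.4°E) | Landslide |
| Case 5 | 2021-10-05T23:00 | Jingpo Village, Pucheng Town, Pu County, Linfen City, Shanxi Province (36.4°N, 111.1°E) | No landslide |
-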
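Table 1 reports two scores for each algorithm: accuracy (ACC) and area under the receiver operating characteristic curve (AUC). As a rough illustration of how such scores can be computed from probability forecasts and observed labels, here is a minimal pure-Python sketch. The 0.5 decision threshold for ACC is an assumption (the source does not state the threshold used), and the toy labels and probabilities are invented for the example and are not data from the study.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded forecast matches the label.

    The 0.5 default threshold is an assumption, not a value from the paper.
    """
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative forecast count as half a win,
    which matches the area under the ROC curve.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration: 1 = landslide, 0 = no landslide.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

acc = accuracy(y_true, y_prob)  # 4 of 6 correct at threshold 0.5
auc = roc_auc(y_true, y_prob)   # 8 of 9 positive/negative pairs ranked correctly
```

ACC depends on the chosen decision threshold, whereas AUC summarizes ranking quality over all thresholds, which is why the two columns can order the algorithms differently.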