An Experimental Study of the Short-time Heavy Rainfall Event Forecast Based on Ensemble Learning and Sounding Data
-
Abstract: Sounding analysis is an important tool for forecasting short-time heavy rainfall events. Using atmospheric stratification and convective parameters from 119 sounding stations in China at 0800 BT (Beijing Time) from 1 June to 30 September during 2015-2019 as features, a forecast model for short-time heavy rainfall events (hourly rainfall of at least 20 mm·h⁻¹) is built with the XGBoost ensemble learning method. Sounding observations and derived physical quantities serve as the feature parameters, and the model forecasts whether short-time heavy rainfall will occur around a sounding station in the following 12 h. An optimization approach for high-impact weather is then proposed: the model is tuned with a piecewise weighted loss function that assigns different weights (wTP, wTN, wFP and wFN) to the four forecast outcomes. Without increasing the total number of wrong forecasts, this shifts errors from misses toward false alarms, which raises the threat score (TS) slightly and the probability of detection (POD) substantially while keeping the false alarm rate (FAR) below a chosen threshold such as 0.5. Two experiments, a weight sensitivity test for the piecewise weighted loss function and a loss function comparison test, are carried out on 12 datasets from 7 regional center sounding stations to verify the effectiveness and generalization of the optimization method and to compare forecast skill before and after the improvement. Finally, a national short-time heavy rainfall forecast experiment, with batch independent verification and case analysis, uses sounding data from 1 June to 30 September 2019 as the independent test set.
Results show that reducing wTP decreases the numbers of hits and false alarms, reducing wFN increases the numbers of hits and false alarms, and wTN and wFP have little influence on the forecast. Compared with other commonly used loss functions, the model trained with the piecewise weighted loss function has better forecast skill: TS improves by 0.05-0.1, POD rises by more than 0.15, and FAR rises by 0.05-0.1. The model thus shows a clear preference for false alarms over misses. The results are similar across all independent experiments, indicating that the optimization method has a consistent effect. In the national independent test, the improved model reaches a POD of 0.66, a FAR of 0.37, a miss rate of 0.34 and a TS of 0.47, which shows that it has some skill in forecasting short-time heavy rainfall. In summary, a short-time heavy rainfall forecast model is constructed from ensemble decision trees and sounding data, and an optimization method that improves the model's forecast skill is proposed and verified.
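The idea of a piecewise weighted loss can be illustrated with a minimal sketch. The exact functional form used in the study is not given in this abstract, so the code below assumes a squared-error loss on the forecast probability whose weight depends on the forecast outcome. The weight values, the 0.5 decision threshold and all function names are hypothetical, and outcomes are named descriptively (hit, miss, false alarm, correct negative) rather than with the wTP/wTN/wFP/wFN symbols. The returned gradient and Hessian are the quantities a custom objective for a gradient-boosting library typically supplies.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical weights per forecast outcome (not values from the study).
# A smaller weight on correct negatives weakens the pull toward "no event".
WEIGHTS: Dict[str, float] = {
    "hit": 1.0,
    "miss": 1.0,
    "false_alarm": 1.0,
    "correct_negative": 0.5,
}


def outcome(prob: float, label: int, threshold: float = 0.5) -> str:
    """Assign one forecast/observation pair to a confusion-matrix cell."""
    forecast_event = prob >= threshold
    if label == 1:
        return "hit" if forecast_event else "miss"
    return "false_alarm" if forecast_event else "correct_negative"


def piecewise_grad_hess(
    probs: Sequence[float],
    labels: Sequence[int],
    weights: Dict[str, float] = WEIGHTS,
) -> Tuple[List[float], List[float]]:
    """Per-sample gradient and Hessian of w * (p - y)**2 with respect to p.

    The weight w is piecewise: it is looked up from the cell the sample
    currently falls in and treated as a constant when differentiating.
    """
    grad: List[float] = []
    hess: List[float] = []
    for p, y in zip(probs, labels):
        w = weights[outcome(p, y)]
        grad.append(2.0 * w * (p - y))
        hess.append(2.0 * w)
    return grad, hess
```

With these assumed weights, a correctly rejected non-event contributes only half the gradient of the other outcomes, which is one way a model could be nudged toward forecasting the event more often. Which weights actually matter is the result of the sensitivity test reported above, not something this sketch demonstrates.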
-
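Table 1 Relations between labels and predictions
| Observed \ Predicted | Occurred (positive) | Did not occur (negative) |
| --- | --- | --- |
| Occurred (true) | TP | TN |
| Did not occur (false) | FP | FN |
Note: each symbol combines the row label (true/false, the observation) with the column label (positive/negative, the forecast). As laid out here, TN is an observed event that was not forecast and FN is a non-event that was not forecast, which differs from the usual confusion-matrix naming.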
Table 2 Data subset of sounding stations
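| Data subset | Training set | Event rate in training set | Independent test set | Event rate in test set |
| --- | --- | --- | --- | --- |
| Experiment 2019 | June-September 2015-2018 | 0.234 | June-September 2019 | 0.179 |
| Experiment 2018 | June-September 2015-2017 and June-September 2019 | 0.214 | June-September 2018 | 0.259 |
| Experiment 2017 | June-September 2015-2016 and June-September 2018-2019 | 0.217 | June-September 2017 | 0.241 |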
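The subsets in Table 2 each hold out one year for independent testing and train on the remaining years. A minimal sketch of that split follows; the function and variable names are hypothetical, and only the year bookkeeping is shown, not the loading of any sounding data.

```python
from typing import Dict, List, Tuple

# Years covered by the dataset (June-September of each year).
YEARS: List[int] = [2015, 2016, 2017, 2018, 2019]


def hold_out_year(test_year: int, years: List[int] = YEARS) -> Tuple[List[int], List[int]]:
    """Return (training years, test years) with one year held out."""
    if test_year not in years:
        raise ValueError(f"{test_year} is not in the dataset")
    train = [y for y in years if y != test_year]
    return train, [test_year]


# The three subsets listed in Table 2.
experiments: Dict[str, Tuple[List[int], List[int]]] = {
    f"Experiment {year}": hold_out_year(year) for year in (2019, 2018, 2017)
}

for name, (train, test) in experiments.items():
    print(name, "train:", train, "test:", test)
```

Holding out a whole season rather than random samples keeps soundings from the same weather episode out of both the training and the test set.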
Table 3 Selected elements
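| No. | Feature | No. | Feature | No. | Feature |
| --- | --- | --- | --- | --- | --- |
| 1-5 | Surface observations* | 33 | Convective available potential energy | 41 | 0-1 km wind shear |
| 6-10 | 925 hPa observations* | 34 | Convective inhibition | 42 | 0-3 km wind shear |
| 11-15 | 850 hPa observations* | 35 | Downdraft convective available potential energy | 43 | 0-6 km wind shear |
| 16-20 | 700 hPa observations* | 36 | Warm cloud layer thickness | 44 | 0-8 km wind shear |
| 21-25 | 500 hPa observations* | 37 | Vertically integrated specific humidity | 45 | Temperature difference between 700 hPa and 500 hPa |
| 26-30 | 400 hPa observations* | 38 | Moist layer thickness | 46 | Temperature difference between 850 hPa and 500 hPa |
| 31 | Height of the -20℃ level | 39 | K index | 47 | Total totals index |
| 32 | Best lifted index | 40 | Lifted index | 48 | Height of the wet-bulb 0℃ level |
Note: * includes temperature, geopotential height, dew point temperature, wind speed and wind direction.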
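Several features in Table 3 are simple combinations of mandatory-level temperature and dew point. The sketch below computes three of them using the standard textbook definitions of the K index and the total totals index; it is an assumption that the study uses these same definitions, and the function names and sample values are hypothetical.

```python
def k_index(t850: float, td850: float, t700: float, td700: float, t500: float) -> float:
    """K index (deg C): (T850 - T500) + Td850 - (T700 - Td700)."""
    return (t850 - t500) + td850 - (t700 - td700)


def total_totals(t850: float, td850: float, t500: float) -> float:
    """Total totals index (deg C): (T850 - T500) + (Td850 - T500)."""
    return (t850 - t500) + (td850 - t500)


def temperature_difference(t_lower: float, t_upper: float) -> float:
    """Temperature difference between a lower and an upper pressure level."""
    return t_lower - t_upper


# Hypothetical sounding values in deg C, for illustration only.
sounding = {"t850": 20.0, "td850": 16.0, "t700": 8.0, "td700": 4.0, "t500": -8.0}

features = {
    "k_index": k_index(**sounding),
    "total_totals": total_totals(sounding["t850"], sounding["td850"], sounding["t500"]),
    "dt_850_500": temperature_difference(sounding["t850"], sounding["t500"]),
    "dt_700_500": temperature_difference(sounding["t700"], sounding["t500"]),
}
print(features)
```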
Table 4 Parameters of XGBoost
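| Parameter | Value |
| --- | --- |
| Model (booster) | gbtree |
| Learning rate | 0.15 |
| Minimum sum of leaf-node weights | 4 |
| Maximum tree depth | 7 |
| Random subsampling rate | 0.75 |
| Random seed | 10 |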
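For readers who want to reproduce the configuration, the values in Table 4 can be written as an XGBoost parameter dictionary. The mapping from the table's descriptive names to XGBoost parameter names is an assumption here, in particular reading the random subsampling rate as row subsampling (`subsample`). The number of boosting rounds and the objective are not listed in the table and are left out.

```python
# Assumed mapping of Table 4 onto XGBoost parameter names.
params = {
    "booster": "gbtree",       # model
    "eta": 0.15,               # learning rate
    "min_child_weight": 4,     # minimum sum of leaf-node weights
    "max_depth": 7,            # maximum tree depth
    "subsample": 0.75,         # random subsampling rate (assumed to be row subsampling)
    "seed": 10,                # random seed
}

# Training is intentionally not shown: with the xgboost package installed, a
# dictionary like this is what xgboost.train(params, dtrain, ...) accepts, but
# the number of boosting rounds used in the study is not given in Table 4.
for name, value in params.items():
    print(f"{name} = {value}")
```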
Table 5 Average result of comparison test of loss function at each sounding station
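| Station | Loss function | TS | POD | FAR | Short-time heavy rainfall events in test set | Event frequency in test set |
| --- | --- | --- | --- | --- | --- | --- |
| Beijing | Piecewise weighted loss | 0.46 | 0.70 | 0.44 | 86 | 0.235 |
| | MSE | 0.38 | 0.49 | 0.35 | | |
| | Logloss | 0.38 | 0.51 | 0.41 | | |
| Qingyuan | Piecewise weighted loss | 0.79 | 0.98 | 0.19 | 266 | 0.727 |
| | MSE | 0.76 | 0.90 | 0.16 | | |
| | Logloss | 0.69 | 0.81 | 0.17 | | |
| Wenjiang | Piecewise weighted loss | 0.59 | 0.85 | 0.34 | 130 | 0.358 |
| | MSE | 0.55 | 0.67 | 0.22 | | |
| | Logloss | 0.52 | 0.63 | 0.21 | | |
| Shanghai | Piecewise weighted loss | 0.51 | 0.80 | 0.42 | 121 | 0.340 |
| | MSE | 0.46 | 0.64 | 0.39 | | |
| | Logloss | 0.46 | 0.66 | 0.40 | | |
| Yuzhong | Piecewise weighted loss | 0.31 | 0.38 | 0.36 | 46 | 0.126 |
| | MSE | 0.24 | 0.27 | 0.32 | | |
| | Logloss | 0.20 | 0.23 | 0.39 | | |
| Wuhan | Piecewise weighted loss | 0.49 | 0.73 | 0.39 | 116 | 0.318 |
| | MSE | 0.44 | 0.56 | 0.31 | | |
| | Logloss | 0.41 | 0.55 | 0.35 | | |
| Jinzhou | Piecewise weighted loss | 0.31 | 0.49 | 0.54 | 68 | 0.193 |
| | MSE | 0.19 | 0.26 | 0.57 | | |
| | Logloss | 0.22 | 0.28 | 0.49 | | |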
Table 6 Quantitative validation of prediction model on 2019 dataset
| Method | POD | FAR | TS |
| --- | --- | --- | --- |
| Ensemble learning forecast model | 0.66 | 0.37 | 0.47 |
| GRAPES_3 km forecast | 0.70 | 0.53 | 0.39 |
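The three scores in Table 6 are linked through the counts of hits, misses and false alarms. The sketch below uses the standard forecast-verification definitions, which are assumed to match those used in the study. The counts are hypothetical, chosen only so that the ratios round to the values reported for the ensemble learning model; they are not the actual sample counts.

```python
from typing import Tuple


def verification_scores(hits: int, misses: int, false_alarms: int) -> Tuple[float, float, float]:
    """Return (POD, FAR, TS) from contingency counts.

    POD = hits / (hits + misses)
    FAR = false alarms / (hits + false alarms)
    TS  = hits / (hits + misses + false alarms)
    """
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    ts = hits / (hits + misses + false_alarms)
    return pod, far, ts


# Hypothetical counts for illustration.
pod, far, ts = verification_scores(hits=66, misses=34, false_alarms=39)
print(f"POD={pod:.2f} FAR={far:.2f} TS={ts:.2f}")  # POD=0.66 FAR=0.37 TS=0.47
```

Because correct negatives do not enter any of the three scores, a model can trade misses for false alarms and still raise TS as long as the gain in hits outweighs the added false alarms, which is the trade-off the piecewise weighted loss aims for.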