Precipitation Forecast Correction in South China Based on SVD and Machine Learning
-
Abstract:
Precipitation results from the combined influence of multiple weather systems and complex physical processes, which makes it difficult to forecast. Numerical models have inherent limitations, so their forecast products inevitably contain errors. Finding more effective ways to correct model products, and to improve their interpretation and applicability, is an active topic in meteorological research and operations. To this end, correction models are built by combining singular value decomposition (SVD) with machine learning, namely multiple linear regression, LASSO regression, and ridge regression. The models are compared with the traditional matrix coefficient method and tested on precipitation forecasts from the European Centre for Medium-Range Weather Forecasts (EC) model for the pre-flood season in South China (1 April to 30 June, 2007-2019). The results show that the correction models combining SVD and machine learning effectively reduce the error of the EC precipitation product in South China: the maximum optimization rate of the root mean square error (RMSE) is 4.2%, and more than 69% of the stations are improved to varying degrees. These models handle collinearity among factors well and are more robust, and their correction effect is better than that of the traditional matrix coefficient method.
A weighted ensemble that assigns different weights to the individual correction models performs better still. Its RMSE optimization rate reaches 5.7%, more than 77% of the stations are improved to varying degrees, and its RMSE in South China is lower than that of both the EC product and every correction model in the ensemble. This indicates that the weighted ensemble combines and amplifies the strengths of the individual correction models.
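The abstract does not spell out how SVD and regression are coupled, so the following is only a minimal sketch of one plausible reading: decompose the cross-covariance between forecast and observed anomalies, project both fields onto the leading coupled modes, and fit a ridge regression between the two sets of expansion coefficients. The function names (`fit_svd_ridge`, `apply_correction`), the number of modes, the penalty `alpha`, and the synthetic data are all assumptions for illustration and are not taken from the paper. Ridge is solved in closed form here; LASSO or ordinary least squares could be swapped in at the same step.

```python
import numpy as np

def rmse(pred, obs):
    """Root mean square error over all days and stations."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def fit_svd_ridge(fcst, obs, n_modes=2, alpha=1.0):
    """Fit a hypothetical SVD + ridge correction model.

    fcst, obs: arrays of shape (n_days, n_stations).
    """
    f_mean, o_mean = fcst.mean(axis=0), obs.mean(axis=0)
    fa, oa = fcst - f_mean, obs - o_mean
    # SVD of the cross-covariance matrix between forecast and observed anomalies
    cov = fa.T @ oa / (fa.shape[0] - 1)
    u, s, vt = np.linalg.svd(cov, full_matrices=False)
    u, v = u[:, :n_modes], vt[:n_modes].T
    # Expansion (time) coefficients of the leading coupled modes
    a = fa @ u
    b = oa @ v
    # Closed-form ridge regression from forecast to observed coefficients
    coef = np.linalg.solve(a.T @ a + alpha * np.eye(n_modes), a.T @ b)
    return {"f_mean": f_mean, "o_mean": o_mean, "u": u, "v": v, "coef": coef}

def apply_correction(model, fcst):
    """Map a raw forecast field to a corrected field via the fitted modes."""
    a = (fcst - model["f_mean"]) @ model["u"]
    return model["o_mean"] + (a @ model["coef"]) @ model["v"].T

# Synthetic stand-in data (NOT the EC or station data used in the paper):
# a rank-2 signal that the "model" over-amplifies and biases.
rng = np.random.default_rng(0)
n_days, n_stations = 400, 30
signal = rng.normal(size=(n_days, 2)) @ rng.normal(size=(2, n_stations))
obs = signal + 0.1 * rng.normal(size=(n_days, n_stations))
fcst = 1.5 * signal + 2.0 + 0.1 * rng.normal(size=(n_days, n_stations))

train, test = slice(0, 300), slice(300, None)
model = fit_svd_ridge(fcst[train], obs[train])
rmse_raw = rmse(fcst[test], obs[test])
rmse_corr = rmse(apply_correction(model, fcst[test]), obs[test])
print(f"raw RMSE: {rmse_raw:.3f}, corrected RMSE: {rmse_corr:.3f}")
```

Regressing on a few expansion coefficients rather than on all stations at once is one way such a scheme can sidestep collinearity among predictors, which is consistent with the robustness the abstract reports.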
-
Table 1 Cumulative variance contribution of the first 10 modes

Mode    Cumulative variance contribution
1       0.5162
2       0.6220
3       0.6763
4       0.7220
5       0.7516
6       0.7716
7       0.7888
8       0.8014
9       0.8112
10      0.8197

Table 2 Correction performance of the different models (methods)

Model (method)              RMSE/(mm·d⁻¹)    Optimization rate/%
Model IV                    13.26            4.1
Model II                    13.26            4.1
Model III                   13.24            4.2
Model I                     13.24            4.2
Weighted ensemble method    13.01            5.7
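The excerpt defines neither the optimization rate nor the ensemble weights. The sketch below assumes the optimization rate is the percentage reduction in RMSE relative to the raw forecast, and that each model's weight is proportional to the inverse of its RMSE on a training period. Both are assumptions, as are the function names and the synthetic data; the numbers it prints are unrelated to Table 2.

```python
import numpy as np

def rmse(pred, obs):
    """Root mean square error over all days and stations."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def optimization_rate(rmse_ref, rmse_model):
    """Percentage RMSE reduction relative to a reference forecast (assumed definition)."""
    return 100.0 * (rmse_ref - rmse_model) / rmse_ref

def inverse_rmse_weights(rmses):
    """Weights proportional to 1/RMSE, normalized to sum to 1 (assumed scheme)."""
    inv = 1.0 / np.asarray(rmses, dtype=float)
    return inv / inv.sum()

def weighted_ensemble(preds, weights):
    """preds: (n_models, n_days, n_stations); weights: (n_models,)."""
    return np.tensordot(weights, np.asarray(preds), axes=1)

# Synthetic stand-in: three "correction models" with independent errors.
rng = np.random.default_rng(1)
obs = rng.normal(size=(200, 30))
preds = np.stack([obs + sd * rng.normal(size=obs.shape) for sd in (1.0, 1.2, 1.5)])

# Estimate weights on the first half, evaluate on the second half.
train, test = slice(0, 100), slice(100, None)
weights = inverse_rmse_weights([rmse(p[train], obs[train]) for p in preds])
ens = weighted_ensemble(preds[:, test], weights)

member_rmse = [rmse(p[test], obs[test]) for p in preds]
ens_rmse = rmse(ens, obs[test])
print("weights:", np.round(weights, 3))
print("member RMSE:", np.round(member_rmse, 3), "ensemble RMSE:", round(ens_rmse, 3))
print("rate vs. best member (%):", round(optimization_rate(min(member_rmse), ens_rmse), 1))
```

In this toy setting the ensemble beats every member because the member errors are independent. With real correction models built from the same EC input, the errors are correlated and the gain would be smaller, which fits the modest step from 4.2% to 5.7% in Table 2.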