Machine Learning Correction of Wind, Temperature and Humidity Elements in Beijing-Tianjin-Hebei Region
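Abstract (translated from Chinese): Four machine learning methods, namely linear regression, gradient boosting regression (GBRT), XGBoost, and stacked ensemble learning (Stacking), are applied under an error-analysis modeling approach to forecasts from the RMAPS-RISE system developed by the Beijing Institute of Urban Meteorology. The study covers forecasts of four meteorological elements (2 m temperature, 2 m relative humidity, 10 m wind speed, and 10 m wind direction) at lead times of 3 to 12 h for all initial times from December 2020 to November 2021, and it develops and tests station forecast error correction under the complex terrain of the Beijing-Tianjin-Hebei region. The results show that, among the four correction models built on forecast error analysis, the Stacking method performs best for all four elements in all four seasons because it integrates the strengths of the other three methods. Among the three single machine learning methods, XGBoost performs best, followed by GBRT and then linear regression, and all of them clearly improve forecast accuracy. Overall, forecast error correction models built with machine learning methods effectively reduce the raw forecast error of the system and help further improve the accuracy of objective station-interpretation products under complex terrain.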
Keywords (translated from Chinese):
- RMAPS-RISE;
- machine learning;
- XGBoost method;
- Stacking method
Abstract: Weather conditions strongly affect agricultural production, transportation, and economic activity, so improving forecast accuracy is a constant public concern. After more than 100 years of development, numerical weather models have become steadily more accurate, but forecast errors remain unavoidable. Improving forecast accuracy by post-processing numerical weather prediction output with error correction methods is therefore an important research topic.
Machine learning methods are applied to correct four meteorological elements forecast by the RMAPS-RISE (rapid-update multi-scale analysis and prediction system-rapid integration and seamless ensemble) system developed by the Beijing Institute of Urban Meteorology. First, the data are preprocessed: the gridded system forecasts are interpolated and the values of each element are extracted at station locations. Automatic weather station observations and the forecast data are then combined into unified datasets for machine learning modeling and application. Second, four methods are set up: linear regression, gradient boosting regression, XGBoost, and Stacking, the last of which combines several machine learning algorithms to improve the generalization ability of the model. An error analysis model is built with each of the four correction methods, and station forecast errors are corrected for each initial time under the complex terrain of the Beijing-Tianjin-Hebei region, both as a technical study and as an experimental application. Finally, the corrected forecasts from the different machine learning methods are compared with the original RMAPS-RISE forecasts in terms of accuracy. In the experiments, two modeling ideas are proposed, and the four machine learning methods are used in correction and comparison experiments.
The results show that, under the modeling idea based on error analysis, the Stacking method performs best, effectively reducing the error of the original system's 3-12 h forecasts for all 24 initial times. Among the three single machine learning methods, XGBoost performs best, followed by gradient boosting regression and then linear regression, and all of them clearly improve forecast accuracy. Overall, forecast error correction models based on machine learning effectively reduce the original forecast error of the RMAPS-RISE system, have broad application prospects in forecast correction, and help further improve the accuracy of objective station-interpretation forecast products under complex terrain.
Key words:
- RMAPS-RISE;
- machine learning;
- XGBoost;
- Stacking
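The error-analysis modeling idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the data are synthetic, the sign convention (error = forecast minus observation) is an assumption, and the two toy base learners (`fit_linear`, `fit_binned`) and the blend-weight meta-learner in `fit_stack` are hypothetical stand-ins for the linear regression, GBRT, and XGBoost models that the paper stacks. Only the Python standard library is used so that the example runs as written.

```python
import random
import statistics


def fit_linear(x, y):
    """Least-squares line through (x, y); returns a prediction function."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if sxx else 0.0
    return lambda v: my + slope * (v - mx)


def fit_binned(x, y, width=5.0):
    """Piecewise-constant learner: mean target per bin of x (crude stand-in for a tree model)."""
    overall = statistics.fmean(y)
    bins = {}
    for xi, yi in zip(x, y):
        bins.setdefault(int(xi // width), []).append(yi)
    means = {key: statistics.fmean(vals) for key, vals in bins.items()}
    return lambda v: means.get(int(v // width), overall)


def fit_stack(x, y, learners, k=5):
    """Stack exactly two base learners using k-fold out-of-fold predictions."""
    n = len(x)
    oof = [[0.0] * n for _ in learners]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        for j, fit in enumerate(learners):
            model = fit([x[i] for i in train], [y[i] for i in train])
            for i in range(fold, n, k):  # samples held out in this fold
                oof[j][i] = model(x[i])
    p1, p2 = oof
    # Meta-learner: least-squares blend weight w for w*p1 + (1 - w)*p2.
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    w = sum((a - b) * (t - b) for a, b, t in zip(p1, p2, y)) / den if den else 0.5
    m1, m2 = (fit(x, y) for fit in learners)  # refit base learners on all data
    return lambda v: w * m1(v) + (1 - w) * m2(v)


def rmse(pred, obs):
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5


rng = random.Random(0)
obs = [rng.uniform(-10.0, 30.0) for _ in range(600)]
# Synthetic forecast with a state-dependent bias plus noise (illustrative only).
fcst = [o + 1.5 + 0.05 * o + rng.gauss(0.0, 1.0) for o in obs]
err = [f - o for f, o in zip(fcst, obs)]  # assumed convention: forecast - observation

train, test = slice(0, 400), slice(400, 600)
error_model = fit_stack(fcst[train], err[train], [fit_linear, fit_binned])
# Corrected forecast = raw forecast minus the predicted forecast error.
corrected = [f - error_model(f) for f in fcst[test]]

raw_rmse = rmse(fcst[test], obs[test])
cor_rmse = rmse(corrected, obs[test])
print(f"raw RMSE {raw_rmse:.2f}, corrected RMSE {cor_rmse:.2f}")
```

In practice a library implementation such as scikit-learn's `StackingRegressor` with gradient-boosting base models would replace the toy learners. The structure stays the same: out-of-fold base predictions feed a meta-learner, and the predicted error is subtracted from the raw forecast.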
Table 1 Corrected results for 2 m temperature based on four methods in each season

| Statistic | Method | Spring | Summer | Autumn | Winter |
|---|---|---|---|---|---|
| Mean RMSE (℃) | Linear regression | 2.16 | 1.93 | 1.93 | 2.40 |
| | GBRT | 2.08 | 1.82 | 1.83 | 2.28 |
| | XGBoost | 1.88 | 1.61 | 1.60 | 1.98 |
| | Stacking | 1.55 | 1.44 | 1.44 | 1.77 |
| Change in RMSE (%) | Linear regression | -10.00 | -15.72 | -19.92 | -19.19 |
| | GBRT | -13.00 | -20.52 | -24.07 | -23.23 |
| | XGBoost | -21.67 | -29.69 | -33.61 | -33.33 |
| | Stacking | -35.40 | -37.12 | -40.25 | -40.40 |

Note: The mean RMSE of the RISE product in spring, summer, autumn, and winter is 2.40, 2.29, 2.41, and 2.97 ℃, respectively.
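The percentage rows of the tables can be related to the RMSE rows with a short calculation. The sketch below assumes the change is computed as (corrected RMSE minus RISE RMSE) divided by RISE RMSE; the source does not state the formula, but this definition reproduces most of the published entries to within rounding. The function names `rmse` and `percent_change` are illustrative.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between paired forecasts and observations."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed))


def percent_change(corrected, baseline):
    """Relative change of a corrected score against the raw-forecast score, in percent."""
    return (corrected - baseline) / baseline * 100.0


# Spring 2 m temperature from Table 1: RISE 2.40, linear regression 2.16, XGBoost 1.88.
print(round(percent_change(2.16, 2.40), 2))
print(round(percent_change(1.88, 2.40), 2))
```

A negative value means the corrected forecast has a lower error than the raw RISE forecast.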
Table 2 Corrected results for 2 m relative humidity based on four methods in each season

| Statistic | Method | Spring | Summer | Autumn | Winter |
|---|---|---|---|---|---|
| Mean RMSE (%) | Linear regression | 12.34 | 10.64 | 12.00 | 12.57 |
| | GBRT | 11.72 | 9.85 | 11.12 | 11.61 |
| | XGBoost | 10.24 | 8.66 | 9.69 | 10.18 |
| | Stacking | 8.89 | 7.61 | 8.50 | 9.04 |
| Change in RMSE (%) | Linear regression | -11.35 | -19.45 | -21.41 | -17.30 |
| | GBRT | -15.80 | -25.44 | -27.18 | -23.62 |
| | XGBoost | -26.44 | -34.44 | -36.44 | -33.03 |
| | Stacking | -36.14 | -42.39 | -44.34 | -40.53 |

Note: The mean RMSE of the RISE product in spring, summer, autumn, and winter is 13.92%, 13.21%, 15.27%, and 15.20%, respectively.
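Table 3 Corrected results for 10 m wind speed based on four methods in each season

| Statistic | Method | Spring | Summer | Autumn | Winter |
|---|---|---|---|---|---|
| Mean RMSE (m·s⁻¹) | Linear regression | 1.31 | 1.07 | 1.25 | 1.43 |
| | GBRT | 1.26 | 1.01 | 1.08 | 1.31 |
| | XGBoost | 1.12 | 0.89 | 0.94 | 1.13 |
| | Stacking | 0.97 | 0.79 | 0.82 | 1.00 |
| Change in RMSE (%) | Linear regression | -8.39 | -14.40 | -19.87 | -23.94 |
| | GBRT | -11.89 | -19.20 | -30.77 | -30.32 |
| | XGBoost | -21.68 | -28.80 | -39.74 | -39.89 |
| | Stacking | -32.17 | -36.80 | -47.44 | -46.81 |

Note: The mean RMSE of the RISE product in spring, summer, autumn, and winter is 1.43, 1.25, 1.56, and 1.88 m·s⁻¹, respectively.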
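Table 4 Corrected results for 10 m wind direction based on four methods in each season

| Statistic | Method | Spring | Summer | Autumn | Winter |
|---|---|---|---|---|---|
| Mean of mean absolute deviation (°) | Linear regression | 69.93 | 77.20 | 74.15 | 72.13 |
| | GBRT | 68.47 | 75.16 | 72.90 | 69.60 |
| | XGBoost | 63.10 | 69.70 | 68.20 | 63.96 |
| | Stacking | 58.80 | 65.63 | 64.88 | 60.32 |
| Change in mean absolute deviation (%) | Linear regression | -1.23 | -4.43 | -16.27 | -8.98 |
| | GBRT | -3.29 | -6.96 | -17.68 | -12.18 |
| | XGBoost | -10.88 | -13.72 | -22.99 | -19.29 |
| | Stacking | -16.95 | -18.75 | -26.74 | -23.89 |

Note: The mean of the mean absolute deviation of the RISE product in spring, summer, autumn, and winter is 70.80°, 80.79°, 88.56°, and 79.25°, respectively.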
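Wind direction is a circular quantity, so a deviation between forecast and observation is usually taken along the shorter arc. The source does not say how the mean absolute deviation for wind direction was computed; the sketch below is one common convention, offered as an assumption, with hypothetical function names.

```python
def angular_difference(forecast_deg, observed_deg):
    """Smallest absolute angle between two directions, in degrees (0 to 180)."""
    diff = abs(forecast_deg - observed_deg) % 360.0
    return min(diff, 360.0 - diff)


def mean_absolute_deviation(forecasts, observations):
    """Mean of the wrap-around-aware absolute deviations over paired samples."""
    diffs = [angular_difference(f, o) for f, o in zip(forecasts, observations)]
    return sum(diffs) / len(diffs)


# A 350 degree forecast against a 10 degree observation differs by 20 degrees, not 340.
print(angular_difference(350.0, 10.0))
```

Without the wrap-around step, directions on either side of north would be scored as almost opposite, which would inflate the deviation.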