Abstract:
To improve the accuracy of numerical weather prediction (NWP) and its ability to forecast extreme weather events, a hybrid model based on ensemble learning is proposed and tested by post-processing one of the most successfully predicted variables, temperature at 2 m height. The NWP dataset is provided by The International Grand Global Ensemble (TIGGE) project at the European Centre for Medium-Range Weather Forecasts (ECMWF), with a horizontal resolution of 0.125°×0.125° and lead times from 6 to 168 h at 6 h increments (28 lead times in total). Observations are collected from 301 stations covering China except for Xizang and Qinghai, and include 4 variables (temperature, pressure, relative humidity and wind speed) recorded every 3 hours. The ECMWF product and the observations span 6 years, from 1 January 2013 to 31 December 2018. Data from 2013 to 2017 are used for model training, and data from 2018 are used for testing. The hybrid model, named ALS, consists of 2 stages. Stage 1 trains two separate models: a long short-term memory network combined with a fully connected neural network (LSTM-FCN), and an artificial neural network (ANN). Stage 2 blends the outputs of LSTM-FCN and ANN using a linear regression (LR) model, whose output is the corrected forecast. The ALS model is then applied to correct the station temperature forecast at lead times from 6 to 168 h. Results are verified against station observations, with an LR model used as the control model. The ALS model reduces the average root mean square error (RMSE) of the station temperature forecast by 0.61℃ (19.6%), and by 0.23℃ (8.4%) relative to the LR model. It also reduces RMSE at more stations than the LR model does (252 vs. 186). The ALS model is particularly effective in areas where the accuracy of the station temperature forecast is low, such as Guizhou and Yunnan, where forecasts are significantly improved, with an average RMSE reduction of over 40%.
Moreover, a case analysis of high-temperature events shows that the ALS model significantly improves their forecast accuracy, with an RMSE reduction of 30.5% at 4 stations relative to the station temperature forecast. This demonstrates that ensemble learning can be used to supplement weather forecasting.
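
The two-stage blending described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two Stage 1 models are replaced by synthetic forecasts with hypothetical biases and noise, the 150/50 train/test split is arbitrary, and all function names (`rmse`, `solve_linear`, `fit_blend`, `blend`) are assumptions made for illustration. Only Stage 2, an ordinary least-squares blend of two forecasts scored by RMSE, is shown.

```python
import math
import random


def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def solve_linear(a, b):
    """Solve the small system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_blend(pred_a, pred_b, obs):
    """Least-squares fit of obs ~ w0 + w1 * pred_a + w2 * pred_b (normal equations)."""
    rows = [(1.0, a, b) for a, b in zip(pred_a, pred_b)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * o for r, o in zip(rows, obs)) for i in range(3)]
    return solve_linear(xtx, xty)


def blend(weights, pred_a, pred_b):
    """Apply fitted blend weights to two forecast series."""
    w0, w1, w2 = weights
    return [w0 + w1 * a + w2 * b for a, b in zip(pred_a, pred_b)]


# Synthetic stand-ins (assumptions): an "observed" temperature series and two
# imperfect forecasts playing the roles of the LSTM-FCN and ANN outputs.
random.seed(0)
obs = [15.0 + 10.0 * math.sin(i / 8.0) for i in range(200)]
pred_lstm_fcn = [o + 1.0 + random.gauss(0.0, 1.5) for o in obs]  # warm bias
pred_ann = [o - 0.5 + random.gauss(0.0, 1.5) for o in obs]       # cold bias

# Fit the blend on a training slice, then score on a held-out slice.
n_train = 150
weights = fit_blend(pred_lstm_fcn[:n_train], pred_ann[:n_train], obs[:n_train])
train_blend = blend(weights, pred_lstm_fcn[:n_train], pred_ann[:n_train])
test_blend = blend(weights, pred_lstm_fcn[n_train:], pred_ann[n_train:])

print("test RMSE, stand-in LSTM-FCN:", rmse(pred_lstm_fcn[n_train:], obs[n_train:]))
print("test RMSE, stand-in ANN:     ", rmse(pred_ann[n_train:], obs[n_train:]))
print("test RMSE, blended:          ", rmse(test_blend, obs[n_train:]))
```

On the training slice, a least-squares blend with an intercept can never have a higher RMSE than either input forecast, because using one forecast unchanged is a special case of the fit. On held-out data that guarantee no longer holds, which is why evaluation on a separate test period, as in the study, matters.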