2018, 29(5): 630-640.
DOI: 10.11898/1001-7313.20180511
Abstract:
Statistical products of surface meteorological data (SMD) are among the most frequently used data in meteorological research and operations. With the improvement of the surface meteorological observation system over China, SMD statistics face problems such as a large number of sites, a wide variety of elements, and complex statistical strategies. SMD now show typical features of big data and could support more precise and efficient operations, but this is beyond the capability of the traditional serial processing framework. Aiming at precise and efficient statistical processing of data from more than 60000 surface weather stations, a statistical processing system for SMD is built based on big data technology. Compared with the traditional serial processing framework, the efficiency of the system has increased by more than 10 times, and more statistics and functions are provided, such as fast calculation, rolling updates of statistical values in response to late-arriving data and correction information, and statistics over arbitrary time scales. Storm distributed stream processing technology is applied in the system to realize efficient statistical calculations. Big data message transmission and caching technologies are also applied to ensure the system's high efficiency and stability. A modular design framework ensures strong extensibility of the system, on the basis of which statistics, quality control and evaluation algorithms are extended to various kinds of data, e.g., upper-air, radiation, oceanic and aircraft measurements. The system is deployed at the national meteorological department and its products are applied synchronously at the provincial level, a layout that ensures data consistency. The system is incorporated into the China Integrated Meteorological Information Sharing System (CIMISS) and has become its core data processing framework.
The system has provided more than 800 real-time multi-scale SMD statistical values to meteorological users and the public through the CIMISS unified data service interface since January 2017. According to data access logs, monthly accesses to daily SMD statistics reached 19.51 million in 2017, ranking 3rd among over 400 data sets or products, and these statistics play important roles in weather monitoring, forecasting and warning, meteorological decision-making, public service and climate research. In the future, the technical framework and algorithm modules of the system will be integrated into the processing pipeline of the meteorological big data cloud platform, with further optimization of the computational topology to make full use of computing resources and to speed up the convergence of results from distributed processing nodes. To further improve the efficiency of statistical processing, the launching mechanism of this operation can be changed from periodic scheduling to automatic scheduling triggered by the completeness of observed data.
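The rolling update and completeness-based triggering mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the system's Storm implementation: the class name `DailyStat`, the assumption of 24 hourly reports per day, and the choice of statistics are all hypothetical and serve only to show the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DailyStat:
    """Rolling daily statistic for one station and one element (hypothetical layout)."""

    expected: int = 24  # assumption: one report per hour, 24 per day
    values: Dict[int, float] = field(default_factory=dict)  # hour -> latest value

    def update(self, hour: int, value: float) -> None:
        # A late-arriving or corrected report fills or overwrites the slot
        # for its hour, so the next summary reflects the newest data.
        self.values[hour] = value

    @property
    def complete(self) -> bool:
        # Completeness check that a scheduler could use as a trigger
        # instead of waiting for a fixed periodic run.
        return len(self.values) >= self.expected

    def summary(self) -> Optional[dict]:
        # Recompute the statistics from whatever reports are present.
        if not self.values:
            return None
        vals = list(self.values.values())
        return {
            "count": len(vals),
            "mean": sum(vals) / len(vals),
            "max": max(vals),
            "min": min(vals),
        }
```

In this sketch, calling `update` again for an hour that already has a value models a correction, and `complete` becoming true models the point at which a data-driven trigger could launch the final statistics for the day.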
Sun Chao, Huo Qing, Ren Zhihua, et al. Design and implementation of surface meteorological data statistical processing system. J Appl Meteor Sci, 2018, 29(5): 630-640. DOI: 10.11898/1001-7313.20180511