Liu Na, Xiong Anyuan, Zhang Qiang, et al. Development of basic dataset of severe convective weather for artificial intelligence training. J Appl Meteor Sci, 2021, 32(5): 530-541. DOI:  10.11898/1001-7313.20210502.

Development of Basic Dataset of Severe Convective Weather for Artificial Intelligence Training

DOI: 10.11898/1001-7313.20210502
  • Received Date: 2021-05-11
  • Rev Recd Date: 2021-06-16
  • Publish Date: 2021-09-30
  • Deep learning shows great potential for nowcasting severe convective weather. Building a deep learning model requires extensive training, which in turn depends on large, high-quality datasets. Based on multi-source observations of the China Meteorological Administration (CMA), disaster reports and internet media information, a dataset of severe convective weather for artificial intelligence training (SCWDS) is established. SCWDS is organized by severe convective weather event. It includes 184865 cases, each composed of several samples within the spatiotemporal window of the event. In total, SCWDS contains 9256405 samples of thunderstorm, gale, short-time heavy rain, hail and tornado in China from 2012 to 2019. Each sample includes a severe weather event annotation and, for the corresponding spatiotemporal window: surface observations of temperature, precipitation, pressure, humidity and wind (average and maximum wind speed); radiosonde observations of temperature, dew point temperature, geopotential height and wind from 1000 to 1 hPa; lightning intensity observations; radar volume scan data; visible, long-wave infrared, water vapor and mid-infrared channel nominal disk data from FY-2E, FY-2G and FY-2D; and environmental factors from ERA5 reanalysis data. Quality control and data cleaning are carried out, and all cases that are discontinuous in time, contain logical-relationship errors or are caused by non-convective factors are eliminated. The results show that thunderstorm, short-time heavy rain and hail mainly occur from April to September, especially from June to August. However, thunderstorm and gale occur most frequently from April to May. Tornado occurs frequently from June to August and in April. Thunderstorm, gale and hail show the same diurnal variation, with the high-frequency period concentrated between afternoon and evening.
The diurnal cycle of short-time heavy rain frequency is bimodal, with peaks at 0300-0400 BT and 1500-1600 BT. The occurrence of severe convective weather shows large spatial variability. Thunderstorms are mainly distributed in South China, Jiangnan, the Tibetan Plateau and the Yunnan-Guizhou Plateau, where the frequency generally exceeds 40 occurrences. Gales are mainly distributed in the northern part of North China, Xinjiang and the coastal areas south of the Yangtze River, with a frequency of more than 10 occurrences. Short-time heavy rain is mainly concentrated in Southwest China, South China, Jiangnan and the Huanghuai region, with a frequency of more than 100 occurrences. Hail is mainly distributed in the Tibetan Plateau, the Yunnan-Guizhou Plateau and the northern part of North China, where the frequency generally exceeds 6 occurrences. Tornadoes are mainly distributed in Jiangsu, Guangdong and the Qiongzhou Strait.
  • Fig. 1  An example of spatial window definition and corresponding observation composition for a severe convective weather event

    (the blue circle of 200 km radius and the red circle of 500 km radius are spatial windows; shading denotes FY-2E long-wave infrared channel brightness temperature)

    Fig. 2  Number (a) and proportion (b) of eliminated gale events caused by non-convective weather factors

    Fig. 3  Number (a) and proportion (b) of eliminated short-time heavy rain events caused by typhoons

    Fig. 4  Case structure of severe convective weather training dataset for artificial intelligence

    Fig. 5  Annual frequency variation of severe convective weather events

    Fig. 6  Daily frequency variation of severe convective weather events

    Fig. 7  Frequency distribution of severe convective weather events

    Table  1  Definition of severe convective weather events

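    Type of severe convective weather | Definition
    Thunderstorm | An electrical discharge within a cumulonimbus cloud, between clouds, or between cloud and ground, appearing as lightning accompanied by thunder; sometimes only thunder is heard and no lightning is seen
    Thunderstorm gale | Strong wind produced under the influence of severe convective cloud clusters, with instantaneous wind speed reaching or exceeding 17.0 m·s⁻¹ and accompanied by lightning [22]
    Short-time heavy rain | A short-duration heavy precipitation event caused by a convective weather system, containing at least one continuous 60-min period with accumulated precipitation of no less than 20 mm; the event begins at the first minute of the first 60-min period with accumulated precipitation of no less than 20 mm and ends at the last minute of the last such period
    Hail | Solid precipitation in the form of hard spherical, conical or irregular particles with a diameter of no less than 2 mm, often accompanying thunderstorms
    Tornado | One of the most violent convective weather phenomena: a small-scale weather system with a very small horizontal extent but great destructive power. It is a small, intense vortex with a vertical axis that accompanies severe convective clouds, with cumuliform cloud above and a pendant funnel-shaped cloud column below. The diameter at the base is generally tens to several hundred metres and does not exceed 800 m, the path length is several hundred metres to several kilometres, and the maximum surface wind speed can reach 140 m·s⁻¹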
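
The short-time heavy rain definition in Table 1 can be read as a sliding-window rule. The following is a minimal sketch of that rule, not the authors' implementation. It assumes a one-value-per-minute precipitation series (in mm) from a single station covering a single event. The function name `find_heavy_rain_event` and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def find_heavy_rain_event(
    minute_precip_mm: List[float],
    window_min: int = 60,
    threshold_mm: float = 20.0,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) minute indices of the event, or None if no window qualifies.

    Assumption: the series holds one value per minute for one station and one event.
    """
    n = len(minute_precip_mm)
    if n < window_min:
        return None

    first_start: Optional[int] = None
    last_start: Optional[int] = None
    running = sum(minute_precip_mm[:window_min])  # accumulation of the first window

    for start in range(n - window_min + 1):
        if start > 0:
            # Slide the 60-min window forward by one minute.
            running += minute_precip_mm[start + window_min - 1]
            running -= minute_precip_mm[start - 1]
        # Small tolerance guards against round-off in the running sum.
        if running >= threshold_mm - 1e-9:
            if first_start is None:
                first_start = start
            last_start = start

    if first_start is None or last_start is None:
        return None
    # The event ends at the final minute of the last qualifying window.
    return first_start, last_start + window_min - 1


# 30 dry minutes, then 60 minutes at 0.5 mm per minute, then 90 dry minutes.
series = [0.0] * 30 + [0.5] * 60 + [0.0] * 90
print(find_heavy_rain_event(series))  # (10, 109)
```

Under this literal reading, the event span can include dry minutes at either end, because it is bounded by the first and last qualifying 60-min windows rather than by the first and last rainy minutes.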

    Table  2  Temporal and spatial window definition of weather conditions for severe convective weather events (negative values are hours before the event begins; positive values are hours after the event ends)

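    Data type characterizing weather conditions | Spatial window | Temporal window
    Surface observations | National-level stations within a circle of 200 km radius centred on the event location | [-2 h, +2 h]
    Radiosonde data | Radiosonde stations within a circle of 500 km radius centred on the event location | [-24 h, +2 h]
    Lightning location data | Lightning location data within a circle of 200 km radius centred on the event location | [-2 h, +2 h]
    Radar base data | Doppler weather radar base data within a circle of 200 km radius centred on the event location | [-2 h, +2 h]
    Satellite multi-channel data | Geostationary satellite multi-channel data within a square of 1000 km side length centred on the event location | [-2 h, +2 h]
    Reanalysis products | Reanalysis products covering China | [-2 h, +2 h]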
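
To make the circular windows in Table 2 concrete, here is a minimal sketch of how stations and observation times might be selected for one event. It assumes great-circle (haversine) distance on a spherical Earth, which the source does not specify. The labels in `CIRCULAR_WINDOWS` and all function names are hypothetical. The satellite square and the China-wide reanalysis domain are left out for brevity.

```python
import math
from datetime import datetime, timedelta
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)

# (radius in km, hours before event start, hours after event end), per Table 2.
# The dictionary keys are hypothetical labels, not names used by the dataset.
CIRCULAR_WINDOWS = {
    "surface": (200.0, 2, 2),
    "radiosonde": (500.0, 24, 2),
    "lightning": (200.0, 2, 2),
    "radar": (200.0, 2, 2),
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_spatial_window(kind: str, event_lat: float, event_lon: float,
                      lat: float, lon: float) -> bool:
    """True if (lat, lon) lies inside the circular window for this data type."""
    radius_km, _, _ = CIRCULAR_WINDOWS[kind]
    return haversine_km(event_lat, event_lon, lat, lon) <= radius_km


def temporal_window(kind: str, event_start: datetime,
                    event_end: datetime) -> Tuple[datetime, datetime]:
    """Window from N hours before the event begins to M hours after it ends."""
    _, before_h, after_h = CIRCULAR_WINDOWS[kind]
    return (event_start - timedelta(hours=before_h),
            event_end + timedelta(hours=after_h))


# Hypothetical event at 30N, 120E lasting from 14:00 to 15:30.
start, end = datetime(2019, 6, 1, 14, 0), datetime(2019, 6, 1, 15, 30)
print(in_spatial_window("surface", 30.0, 120.0, 31.0, 120.0))
print(temporal_window("radiosonde", start, end))
```

A station about 330 km from the event falls outside the 200 km surface window but inside the 500 km radiosonde window, which is why the radius is looked up per data type.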

    Table  3  Description of data cleaning methods for severe convective weather events

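    Data cleaning method | Description
    Incomplete data cleaning | (1) Records with missing spatiotemporal attributes that cannot be filled by statistical methods are removed as missing data; (2) missing physical intensity attributes are filled, following a spatial-consistency statistical method, with the intensity value observed at the same time by the nearest station
    Inconsistent data cleaning | Data whose units, formats and similar properties are expressed inconsistently because of different observation periods or sources are unified
    Discontinuous data cleaning | Cleaning of events whose duration is too short and of events at the same location separated by too short a time interval
    Logical-relationship error cleaning | Cleaning of data whose attribute values violate the logical relationships specified by operational rules
    Non-convective event cleaning | Cleaning of weather events influenced by non-convective factors
    Internet data verification | The Yearbook of Meteorological Disasters in China for 2012-2019 [35], observations from national surface meteorological stations and the CMA disaster direct-reporting system serve as reference baselines; an event recorded in these sources is considered a true record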
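
The two incomplete-data rules in Table 3 can be illustrated with a short sketch. It is written under assumptions and is not the authors' code. Records are plain dictionaries with hypothetical keys `time`, `lat`, `lon` and `intensity`. Great-circle distance is used to pick the nearest station. Only originally observed values act as donors. Records with no same-time donor are left unfilled.

```python
import math
from typing import Any, Dict, List

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def clean_incomplete(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply the two incomplete-data rules to copies of the input records."""
    # Rule 1: drop records whose time or location is missing.
    kept = [
        dict(r) for r in records
        if r.get("time") is not None
        and r.get("lat") is not None
        and r.get("lon") is not None
    ]

    # Donors are fixed before filling, so filled values are never propagated.
    donors = [r for r in kept if r.get("intensity") is not None]

    # Rule 2: fill a missing intensity from the nearest station at the same time.
    for rec in kept:
        if rec.get("intensity") is not None:
            continue
        same_time = [d for d in donors if d["time"] == rec["time"]]
        if not same_time:
            continue  # no donor available; the value stays missing
        nearest = min(
            same_time,
            key=lambda d: haversine_km(rec["lat"], rec["lon"], d["lat"], d["lon"]),
        )
        rec["intensity"] = nearest["intensity"]
        rec["filled"] = True  # hypothetical flag marking an imputed value
    return kept


sample = [
    {"time": "t1", "lat": 30.0, "lon": 120.0, "intensity": 18.0},
    {"time": "t1", "lat": 30.1, "lon": 120.0, "intensity": None},
    {"time": "t1", "lat": 35.0, "lon": 120.0, "intensity": 25.0},
    {"time": None, "lat": 31.0, "lon": 121.0, "intensity": 20.0},
    {"time": "t2", "lat": 30.0, "lon": 120.0, "intensity": None},
]
cleaned = clean_incomplete(sample)
print(len(cleaned), cleaned[1]["intensity"])
```

Flagging imputed values, as the hypothetical `filled` key does here, keeps filled intensities distinguishable from observed ones in later quality checks.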
  • [1] Tang W Y, Zhou Q L, Liu X H, et al. Analysis on verification of national severe convective weather categorical forecasts. Meteor Mon, 2017, 43(1): 67-76. https://www.cnki.com.cn/Article/CJFDTOTAL-QXXX201701007.htm
    [2] Hitchens N M, Brooks H E. Evaluation of the Storm Prediction Center's day 1 convective outlooks. Wea Forecasting, 2012, 27(6): 1580-1585. doi: 10.1175/WAF-D-12-00061.1
    [3] Gagne D J, McGovern A, Haupt S E, et al. Storm-based probabilistic hail forecasting with machine learning applied to convection-allowing ensembles. Wea Forecasting, 2017, 32(5): 1819-1840. doi: 10.1175/WAF-D-17-0010.1
    [4] Sun J, Cao Z, Li H, et al. Application of artificial intelligence technology to numerical weather prediction. J Appl Meteor Sci, 2021, 32(1): 1-11. doi: 10.11898/1001-7313.20210101
    [5] Perler D, Marchand O. A study in weather model output postprocessing: Using the boosting method for thunderstorm detection. Wea Forecasting, 2009, 24(1): 211-222. doi: 10.1175/2008WAF2007047.1
    [6] Lagerquist R, McGovern A, Smith T. Machine learning for real-time prediction of damaging straight-line convective wind. Wea Forecasting, 2017, 32(6): 2175-2193. doi: 10.1175/WAF-D-17-0038.1
    [7] Marzban C, Witt A. A Bayesian neural network for severe-hail size prediction. Wea Forecasting, 2001, 16(5): 600-610. doi: 10.1175/1520-0434(2001)016<0600:ABNNFS>2.0.CO;2
    [8] Marzban C, Stumpf G J. A neural network for tornado prediction based on Doppler radar-derived attributes. J Appl Meteor, 1996, 35(5): 617-626. doi: 10.1175/1520-0450(1996)035<0617:ANNFTP>2.0.CO;2
    [9] Mecikalski J R, Williams J K, et al. Probabilistic 0-1-h convective initiation nowcasts that combine geostationary satellite observations and numerical weather prediction model data. J Appl Meteorol Climatol, 2015, 54(5): 1039-1059. doi: 10.1175/JAMC-D-14-0129.1
    [10] Shi X J, Chen Z R, Wang H, et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting//Proc 28th Int Conf on NIPS, 2015: 802-810.
    [11] Han F, Long M S, Li Y A, et al. The application of recurrent neural network to nowcasting. J Appl Meteor Sci, 2019, 30(1): 61-69. doi: 10.11898/1001-7313.20190106
    [12] Shi X J, Gao Z H, Lausen L, et al. Deep Learning for Precipitation Nowcasting: A Benchmark and New Model//Proc 31st Conf on NIPS, 2017: 5617-5627.
    [13] Jing J, Li Q, Peng X. MLC-LSTM: Exploiting the spatiotemporal correlation between multi-level weather radar echoes for echo sequence extrapolation. Sensors, 2019, 19(18): 3988-4008. doi: 10.3390/s19183988
    [14] Zhou K H, Zheng Y, Li B, et al. Forecasting different types of convective weather: A deep learning approach. J Meteor Res, 2019, 33(5): 797-809. doi: 10.1007/s13351-019-8162-6
    [15] Su H, Deng J, Li F F. Crowdsourcing Annotations for Visual Object Detection. AAAI Human Computation Workshop, 2012. http://www.researchgate.net/publication/291249011_Crowdsourcing_annotations_for_visual_object_detection
    [16] Liu B J, Zhang Y P, Li Z J, et al. An objective hailstorm labeling algorithm based on ground observation. J Appl Meteor Sci, 2021, 32(1): 78-90. doi: 10.11898/1001-7313.20210107
    [17] Russakovsky O, Deng J, Su H, et al. ImageNet large scale visual recognition challenge. Int J Comput Vis, 2015, 115(3): 211-252. doi: 10.1007/s11263-015-0816-y
    [18] Dai A, Chang A X, Savva M, et al. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes//IEEE Conf on CVPR, 2017: 2432-2443.
    [19] Hersbach H, Bell B, Berrisford P, et al. The ERA5 global reanalysis. Q J R Meteorol Soc, 2020, 146(730): 1999-2049. doi: 10.1002/qj.3803
    [20] Zheng Y G, Zhou K H, Sheng J, et al. Advances in techniques of monitoring, forecasting and warning of severe convective weather. J Appl Meteor Sci, 2015, 26(6): 641-657. doi: 10.11898/1001-7313.20150601
    [21] China Meteorological Administration. Specifications for Surface Meteorological Observation. Beijing: China Meteorological Press, 2003.
    [22] Wang H, Li Y, Song L L, et al. Comparison of characteristics and environmental factors of thunderstorm gales over the Sichuan-Tibet Region. J Appl Meteor Sci, 2020, 31(4): 435-446. doi: 10.11898/1001-7313.20200406
    [23] Wang B M. A study on synthetic differentiation method for basic meteorological data quality control. J Appl Meteor Sci, 2004, 15(Suppl I): 50-59. https://www.cnki.com.cn/Article/CJFDTOTAL-YYQX2004S1008.htm
    [24] Ren Z H, Xiong A Y, Zou F L. The quality control of surface monthly climate data in China. J Appl Meteor Sci, 2007, 18(4): 516-523. http://qikan.camscma.cn/article/id/20070481
    [25] Ren Z H, Zhang Z F, Sun C, et al. Development of three step quality control system of real time observation data from AWS in China. Meteor Mon, 2015, 41(10): 1268-1277. https://www.cnki.com.cn/Article/CJFDTOTAL-QXXX201510010.htm
    [26] Wang H J, Liu Y. Comprehensive consistency method of data quality controlling with its application to daily temperature. J Appl Meteor Sci, 2012, 23(1): 69-76. http://qikan.camscma.cn/article/id/20120108
    [27] Zhou S H. Quality control and technical method for producing data set for upper-air data in China. J Appl Meteor Sci, 2000, 11(3): 364-370. http://qikan.camscma.cn/article/id/20000353
    [28] Ruan X, Xiong A Y, Hu K X, et al. Correcting geopotential height errors of some mandatory levels of Chinese historic radiosonde observations. J Appl Meteor Sci, 2015, 26(3): 257-267. doi: 10.11898/1001-7313.20150301
    [29] Wen H, Liu L P, Zhang C A. Operational evaluation of radar data quality control for ground clutter and electromagnetic interference. J Meteor Sci, 2016, 36(6): 789-799. https://www.cnki.com.cn/Article/CJFDTOTAL-QXKX201606009.htm
    [30] Liu L P, Wu L L, Yang Y M. Development of fuzzy-logical two-step ground clutter detection algorithm. Acta Meteor Sinica, 2007, 65(2): 252-260. https://www.cnki.com.cn/Article/CJFDTOTAL-QXXB200702010.htm
    [31] Tan X, Liu L P, Fan S R. Statistical characteristics of sea clutter and identification of sea clutter with CINRAD. Acta Meteor Sinica, 2013, 71(5): 962-975. https://www.cnki.com.cn/Article/CJFDTOTAL-QXXB201305015.htm
    [32] Leng L, Huang X Y, Yang H P, et al. Recognition and application of Doppler weather radar clear air echoes. Meteor Sci Technol, 2012, 40(4): 24-31. https://www.cnki.com.cn/Article/CJFDTOTAL-QXKJ201204005.htm
    [33] Xiao Y J, Wan Y F, Wang J, et al. Study of an automated Doppler radar velocity dealiasing algorithm. Plateau Meteor, 2012, 31(4): 1119-1128. https://www.cnki.com.cn/Article/CJFDTOTAL-GYQX201204028.htm
    [34] Jiang Y. Meteorological Radar Data Quality Control Study and Application. Beijing: Chinese Academy of Meteorological Sciences, 2013.
    [35] China Meteorological Administration. Yearbook of Meteorological Disasters in China (2013-2019). Beijing: China Meteorological Press, 2013-2019.