Lu Yinghua, Ma Tinghuai, Cao Hao, et al. Adaptive optimization in small size file transmission of massive meteorological data. J Appl Meteor Sci, 2014, 25(5): 629-637.

Adaptive Optimization in Small Size File Transmission of Massive Meteorological Data

  • Received Date: 2013-11-07
  • Revision Received Date: 2014-05-26
  • Publish Date: 2014-09-30
  • The data transfer and service architecture built by the National Meteorological Information Center is the foundation for most meteorological data transmission. Improving the timeliness with which the various data types are transmitted is a key concern for strengthening meteorological services. To meet the performance requirements of transmitting massive numbers of small files, transmission parameters are optimized, and a self-adapting data transmission method based on real-time network status is proposed. The method focuses on the network transmission protocol and on file compression, and adjusts both compression parameters and network transmission parameters during operation.

    Meteorological data include a great number of heterogeneous small files, so compressing small files into one large file before transfer effectively reduces I/O accesses. First, 50 KB is determined through experiments as the threshold for small meteorological data files. Then, by analyzing file transfer time, the number of files per compressed package that yields the best transmission efficiency is calculated. Finally, because network conditions vary over time, a self-adapting compression method is designed that adjusts the compression level in real time according to the current network state. The whole compression process is controlled by setting parameters of the lzop command, which builds on the LZO algorithm and its library. To adjust the compression level to real-time network conditions, the RTT (round-trip time) is used to judge the current state of network congestion: the current RTT is compared with the previous RTT to decide whether to change the compression level.

    For network transmission optimization, experiments on the Globus platform show that TCP buffers and parallel transmission consume memory resources, and that more parallel streams and larger TCP buffers lead to network congestion. Accordingly, a self-adapting algorithm for adjusting the TCP buffer size and an algorithm for setting the number of concurrent TCP connections, both based on real-time network parameters, are designed. Finally, a complete transmission framework for massive small files is designed by combining the self-adapting compression method with the transmission parameter optimization. Experiments with the integrated self-adapting algorithms show that the proposed optimization methods markedly improve transmission performance.
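The abstract gives the 50 KB small-file threshold and says the compression level is changed by comparing the current RTT with the previous one, but it does not state the exact decision rule. The following is a minimal Python sketch under stated assumptions, not the authors' algorithm: the function names, the 10% tolerance band, and the direction of adjustment (compress harder when RTT rises, lighter when it falls) are all hypothetical. The level range 1 to 9 mirrors the `-1` (fastest) to `-9` (best) options of the lzop command.

```python
SMALL_FILE_THRESHOLD = 50 * 1024  # 50 KB threshold reported in the abstract
MIN_LEVEL, MAX_LEVEL = 1, 9       # lzop accepts -1 (fastest) through -9 (best)


def split_by_size(sizes, threshold=SMALL_FILE_THRESHOLD):
    """Partition a {name: size_in_bytes} mapping into (small, large) name lists.

    Small files are candidates for bundling into one compressed package;
    treating "below the threshold" as small is an assumption.
    """
    small = [name for name, size in sizes.items() if size < threshold]
    large = [name for name, size in sizes.items() if size >= threshold]
    return small, large


def next_level(level, prev_rtt, cur_rtt, tolerance=0.1):
    """Return the compression level to use for the next package.

    Assumed rule: a clearly higher RTT suggests congestion, so spend more
    CPU to send fewer bytes; a clearly lower RTT suggests a free link, so
    favor speed. Changes inside the tolerance band leave the level alone.
    """
    if cur_rtt > prev_rtt * (1 + tolerance):
        return min(level + 1, MAX_LEVEL)
    if cur_rtt < prev_rtt * (1 - tolerance):
        return max(level - 1, MIN_LEVEL)
    return level
```

In use, a sender would measure the RTT before each package, call `next_level`, and pass the result to lzop as the matching numeric option. The tolerance band is there only to keep the level from oscillating on ordinary RTT jitter.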
  • Fig. 1  Different transmission performances with different file sizes

    Fig. 2  Flow chart of adaptive compression algorithm

    Fig. 3  Flow chart of adaptive TCP buffer size adjustment

    Fig. 4  Result of adaptive buffer size adjustment

    Fig. 5  Flow chart of combined adaptive transmission algorithm
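The abstract reports that larger TCP buffers and more parallel streams cost memory and can congest the network, and that the buffer size is adapted to real-time network parameters, without giving the formula. One common starting point, shown here as a hedged Python sketch and not as the paper's algorithm, is to size the buffer to the bandwidth-delay product (bandwidth times RTT), divide it among the parallel streams, and clamp the result. The function name, parameters, and the 64 KiB and 16 MiB bounds are assumptions for illustration.

```python
def tcp_buffer_bytes(bandwidth_bps, rtt_ms, streams=1,
                     min_bytes=64 * 1024, max_bytes=16 * 1024 * 1024):
    """Suggest a per-stream TCP buffer size in bytes.

    bandwidth_bps: estimated path bandwidth in bits per second
    rtt_ms:        measured round-trip time in milliseconds
    streams:       number of parallel TCP streams sharing the path
    The bounds keep memory use in check (assumed values).
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    # Bandwidth-delay product: bits/s * ms / 1000 gives bits, / 8 gives bytes.
    bdp_bytes = bandwidth_bps * rtt_ms / 8000
    per_stream = bdp_bytes / streams
    return int(min(max(per_stream, min_bytes), max_bytes))
```

For example, a 1 Gbit/s path with a 50 ms RTT has a bandwidth-delay product of 6,250,000 bytes, so four parallel streams would each get about 1.5 MB. Re-evaluating the function whenever a new RTT sample arrives gives a simple adaptive loop.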
