Adaptive Optimization in Small Size File Transmission of Massive Meteorological Data
 
Abstract
    The data transfer and service architecture built by the National Meteorological Information Center is the foundation for most meteorological data transmission, and improving the timeliness with which the various data are transmitted is an active topic in enhancing meteorological service capabilities. To meet the transmission performance requirements of massive numbers of small files, transmission parameters are optimized and a self-adapting data transmission method based on real-time network status is proposed. The method focuses on the network transmission protocol and on file compression, and both compression parameters and network transmission parameters are adjusted during operation.
    Meteorological data include a large number of heterogeneous small files, so compressing small files into one large file before transfer effectively reduces I/O accesses. First, 50 KB is defined through experiments as the threshold for small meteorological data files. Then, by analyzing the file transfer time, the number of files per compressed package that gives the best transmission efficiency is calculated. Finally, because network conditions vary over time, a self-adapting compression method is designed that adjusts the compression level according to real-time network conditions. The entire compression process is controlled by setting the parameters of lzop commands, which build on the LZO algorithm and its library. To adjust the compression level according to real-time network conditions, the RTT (round-trip time) is used to judge the current state of network congestion: the current RTT is compared with the previous RTT to decide whether the compression level should be changed.
    For network transmission optimization, experiments on the Globus platform show that TCP buffers and parallel transmission consume memory resources, and that more parallel streams and larger TCP buffers can also cause network congestion. A self-adapting adjustment algorithm for the TCP buffer size and an algorithm for the number of concurrent TCP connections are therefore designed, both based on real-time network parameters. Finally, the complete transmission framework for massive small files is designed by combining the self-adapting compression method with the transmission parameter optimization. Experiments on the integrated self-adapting algorithms show that the proposed optimization methods substantially improve transmission performance.
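    The RTT comparison described above can be illustrated with a minimal Python sketch. It is not the algorithm of this paper: the function name, the 10% tolerance, the one-step adjustment, and the direction of the adjustment (compress harder when the RTT rises, lighter when it falls) are all assumptions made for illustration. The level range 1 to 9 mirrors the `-1` to `-9` options of the lzop command.

```python
def next_compression_level(level, prev_rtt_ms, curr_rtt_ms,
                           tolerance=0.1, min_level=1, max_level=9):
    """Pick the compression level for the next package of small files.

    Hypothetical policy (an assumption, not taken from the paper):
    - RTT grew by more than `tolerance`: the network looks congested,
      so compress harder to send fewer bytes.
    - RTT shrank by more than `tolerance`: the network has headroom,
      so compress lighter to save CPU time.
    - otherwise keep the current level.
    """
    if prev_rtt_ms <= 0:
        # No usable previous sample yet; leave the level unchanged.
        return level

    # Relative change of the current RTT against the previous RTT.
    change = (curr_rtt_ms - prev_rtt_ms) / prev_rtt_ms

    if change > tolerance:
        return min(level + 1, max_level)
    if change < -tolerance:
        return max(level - 1, min_level)
    return level


if __name__ == "__main__":
    # Simulated RTT samples in milliseconds (illustrative values only).
    samples = [40.0, 42.0, 60.0, 90.0, 50.0]
    level = 3
    for prev, curr in zip(samples, samples[1:]):
        level = next_compression_level(level, prev, curr)
        print(f"prev={prev} ms, curr={curr} ms, next level={level}")
```

    In such a scheme the returned level would be passed to the compressor before each package is built, so the compression cost tracks the measured congestion rather than a fixed setting.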
 