A Discussion of the Application of Data Warehouse Technology in Weather Forecast Decision-Making

Data Warehouse and Its Potential in Weather Forecast

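  • Summary: This article outlines the concept and characteristics of the data warehouse. It discusses the main technical problems to be solved in data warehouse data storage, online analytical processing (OLAP), and data mining (DM), with emphasis on the application of data warehouse technology to weather forecasting. Data warehouse technology converts raw data into data that are convenient to analyze and strengthens the ability to manage and use historical data and special observation data; DM can help forecasters accumulate experience quickly, and OLAP frees forecasters' analysis from the fixed frameworks of the past. In view of the characteristics of weather forecast decision-making, the article proposes extensions such as data aggregation centered on weather system analysis and the addition of comparative analysis, multivariate analysis, and analog analysis to OLAP's multidimensional analysis. It also points out that mining association rules is a new approach worth trying in current research on forecasting methods.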

     

    Abstract: An important problem with current forecasting platforms is that although the system provides a large amount of data (more than 2 GB, or several thousand weather fields, per day), forecasters use only a small fraction of it (less than 1%) in operational forecasting. Another important issue is how to give the system data management flexible enough for forecasters to use historical data efficiently. The data warehouse is a good solution to both problems. A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making process. The data in the warehouse are processed data, called "analytic data", as opposed to the original operational data. "Subjects" are the objects to be analyzed in weather forecasting, for example the concepts in a forecaster's experience. "Analytic data" are the actual values corresponding to the subjects, obtained by transforming the original operational data according to the subject definitions (for weather forecasting, transformation based on weather systems is the most important). A data warehouse is built in three steps: creating a subject system from the set of concepts in forecasters' knowledge; defining, for each subject in that system, the transformation that turns operational data into analytic data; and running the transformation program on the operational data every day to obtain real-time analytic data and save them in a database. The data warehouse is thus a collection of analytic data. Going from concept to subject to analytic data in this way, the data used in analysis directly match the concepts in the forecaster's mind, which speeds up data analysis and allows more data to be used in operational forecasting.

    The data warehouse has two important analysis tools. Data mining (DM) is an exploratory tool: the DM system automatically explores the analytic data set for relationships among the subjects (i.e., the forecasters' concepts). The resulting relationships are saved in the knowledge base of the data warehouse and reinforce the forecaster's knowledge. The mining of association rules is noteworthy because it is sometimes more reasonable than linear regression analysis. Online analytical processing (OLAP) is the other analysis tool, an interactive validation tool. Forecasters use it to view data and validate relationships (both their own conjectures and the results from DM) and then make forecast decisions. Its core technology is multidimensional analysis. For weather forecasting in particular, OLAP also provides comparative analysis, multivariate analysis, and analog analysis, all based on multidimensional analysis. OLAP will be the forecaster's main workbench in the data warehouse.

    The data warehouse uses metadata, which makes data management and maintenance easier and more flexible and makes historical data and heterogeneous data, such as special observation data and even Internet data, easier for applications to use. The bottleneck of traditional knowledge base systems is knowledge acquisition. In a data warehouse, forecasters first put their concepts into the subject system, then obtain relationships between concepts from DM (or enter certain known relationships manually), and validate them with OLAP. Forecasters' knowledge will thus be used systematically in the forecast process, and the bottleneck will be eased.
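
    As an illustration of the association rule mining mentioned in the abstract, the following minimal Python sketch counts how often a set of subjects co-occurs with a target event in daily analytic data and reports rules that exceed support and confidence thresholds. It is not the authors' implementation. The subject names (trough_500hPa, low_level_jet, and so on), the six sample days, the thresholds, and the function names are all hypothetical, chosen only to make the idea concrete.

```python
from itertools import combinations

# Hypothetical analytic data: the set of subjects observed on each day.
# Names and values are invented for illustration, not taken from the paper.
DAYS = [
    {"trough_500hPa", "low_level_jet", "heavy_rain"},
    {"trough_500hPa", "low_level_jet", "heavy_rain"},
    {"trough_500hPa", "heavy_rain"},
    {"trough_500hPa"},
    {"low_level_jet"},
    {"subtropical_high"},
]


def count_days(itemset, days):
    """Number of days on which every subject in itemset is present."""
    itemset = frozenset(itemset)
    return sum(itemset <= day for day in days)


def mine_rules(days, target, min_support=0.3, min_confidence=0.7, max_size=2):
    """Find rules 'antecedent -> target' by brute-force enumeration.

    Returns a list of (antecedent, support, confidence) tuples, where
    support = P(antecedent and target) and
    confidence = P(target | antecedent), both estimated from day counts.
    """
    subjects = sorted(set().union(*days) - {target})
    rules = []
    for size in range(1, max_size + 1):
        for antecedent in combinations(subjects, size):
            n_ante = count_days(antecedent, days)
            n_both = count_days(set(antecedent) | {target}, days)
            if n_ante == 0:
                continue  # antecedent never observed
            support = n_both / len(days)
            confidence = n_both / n_ante
            if support >= min_support and confidence >= min_confidence:
                rules.append((antecedent, support, confidence))
    return rules


if __name__ == "__main__":
    for antecedent, support, confidence in mine_rules(DAYS, "heavy_rain"):
        print(f"{' & '.join(antecedent)} -> heavy_rain "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

    On this toy data the sketch keeps two rules: the trough alone (confidence 0.75) and the trough together with the low-level jet (confidence 1.00). A rule of this form states a conditional co-occurrence rather than a linear dependence, which is one way to read the abstract's remark that association rules are sometimes more reasonable than linear regression. Brute-force enumeration is used here only for brevity; a practical system would more likely use an algorithm such as Apriori to prune the search.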

     
