Preliminary Analysis of Technical Strategies for Digitizing Historical Paper Meteorological Archives
-
Abstract: Digitizing meteorological archives with secure scanning and optical character recognition (OCR) is an effective way to rescue and exploit historical paper meteorological archives. Based on a survey of digitization techniques and experiments with them, this paper proposes an approach to digitizing historical paper meteorological archives. In view of the characteristics of the content recorded in these archives, it analyzes the application of OCR handwritten numeral recognition and proposes a strategy for OCR recognition of meteorological archives, offering the field a technical approach and a useful technical reference for digitizing paper meteorological archives.

A scientific and effective route to digitizing meteorological archives is to use the integrated-carrier and OCR techniques that are advanced at home and abroad to build a system platform for digitizing meteorological data. The platform consists of high/low secure scanners, personal computers, storage devices, management software, OCR software, application software, and so on. The system integrates the scanning, quality control, and statistical processing of the various paper and microfilm archives; creates electronic files in a unified format on a single medium; uses OCR to extract data from long-sequence meteorological records; and ultimately addresses both the protection and the digitization of the paper meteorological data held in the collection.

Digitizing meteorological archives is not simply a matter of data processing. It involves a series of associated techniques, including classification of the archives, development of standards and specifications, secure scanning, OCR, data storage, construction of data sets, and information retrieval and application. It integrates archives held on different carriers so that the protection and the digital use of the archives are realized together. The resulting electronic documents and long-term digitized records are important for protecting the archives and for climate analysis in various fields.

This preliminary analysis of the digitization strategy indicates that applying secure scanning and OCR to paper meteorological archives is feasible. At present, the meteorological archive digitization system has entered the implementation phase, and digitizing the historical paper meteorological archives will provide a foundation for conserving and applying historical meteorological data.
-
Table 1. Results of the meteorological data recognition test using the Uniwex form recognition software
-