Abstract
Human society has entered an era of information explosion. How to use new computer technologies to automatically acquire topic-specific information from the Internet and deliver it as a real-time service has become one of the hot topics in information technology research. Building on the research and implementation of key technologies including web crawling, data extraction, and intelligent text classification, the authors developed and integrated PetroDIS, a dynamic information system for the global oil and gas industry. The system automates information acquisition, information classification, webpage construction, and other tasks, greatly improving the efficiency of information collection.
Source
Information Technology (《信息技术》)
2013, No. 12, pp. 23-26 (4 pages)
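Funding
National Major Science and Technology Project on Oil and Gas, "Research on Global Remaining Oil and Gas Resources and Rapid Evaluation Technology for Oil and Gas Assets (Phase II)" (2011ZX05028-004)
PetroChina Company Limited Major Project, "Resource Evaluation Research" (2012 E-050104)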
Keywords
web crawler
webpage analysis
intelligent classification
adaptive neural network
petroleum dynamic information system