Abstract
Automatically finding the information one needs on the Internet accurately, effectively, and in a timely manner is an important research problem in Web information processing. Building on two previously proposed methods, Web page search guided by a search-path heuristic and Web page information extraction that draws on multiple kinds of knowledge about how multi-record data is represented, this paper presents an implementation of a Web-based Enterprise Competitor Intelligence Mining Platform. The platform can automatically locate the required target pages across multiple enterprise portal sites and extract the multiple data records they contain without human intervention. The platform was used in an experiment on automatically searching enterprise portals for job-posting information. The experimental results confirm the platform's effectiveness and accuracy in automatic information collection.
Source
《微计算机应用》
2004, Issue 1, pp. 1-7 (7 pages)
Microcomputer Applications
Funding
National Natural Science Foundation of China (60075015)