Abstract
This paper reviews the strengths and weaknesses of three traditional approaches to topic-specific web resource discovery: crawling algorithms based on page content evaluation, on link structure evaluation, and on reinforcement learning. It then introduces a method for building a topic Ontology from a dictionary, which speeds up Ontology construction and saves users considerable time. Finally, building on the analysis of the traditional algorithms, it proposes a new Ontology-based approach to topic-specific web page crawling and demonstrates its advantages through experiments.
Source
《图书情报工作》
CSSCI
Peking University Core Journal (北大核心)
2006, Issue 5, pp. 78-82 (5 pages)
Library and Information Service
Funding
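One of the research outcomes of the 2004 Zhejiang Provincial Natural Science Foundation project "Research on Semantic Information Search and Mining for E-Commerce" (Project No. M063149).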