Abstract
Existing research on Web structure mining focuses on finding a site's core (authoritative) nodes rather than its overall hyperlink structure. To address this, this paper designs a model of the Website logical domain core and its entry paths, and proposes a logical domain core mining algorithm and a domain core entry path mining algorithm. Experiments on four large Websites show that both algorithms achieve relatively high precision and recall.
Source
《计算机工程》
CAS
CSCD
PKU Core (北大核心)
2010, Issue 21, pp. 57-58, 61 (3 pages)
Computer Engineering
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60702075)