Abstract
For Web content mining, a preliminary identification of the objects to be mined is essential: pages that carry actual content must be identified before any further analysis can be effective. This paper proposes the concept of link ratio and uses it to analyze the characteristics of Web pages. Supervised learning is then applied to derive classification rules, and these rules are used to classify new Web pages.
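The abstract does not define link ratio or the form of the learned rules, so the following is only a minimal sketch under stated assumptions: link ratio is taken here to be the share of visible text characters that sit inside `<a>` elements, and the "rule" is a single threshold learned from labeled pages. The names `link_ratio`, `learn_threshold`, and `is_content_page` are hypothetical and are not taken from the paper.

```python
from html.parser import HTMLParser


class LinkRatioParser(HTMLParser):
    """Counts visible text characters inside and outside <a> elements."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0   # > 0 while inside an <a> element
        self.skip_depth = 0     # > 0 while inside <script> or <style>
        self.link_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1
        elif tag in ("script", "style") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            return
        n = len("".join(data.split()))  # ignore whitespace
        self.total_chars += n
        if self.anchor_depth:
            self.link_chars += n


def link_ratio(html):
    """Assumed definition: anchor-text characters / all visible text characters."""
    parser = LinkRatioParser()
    parser.feed(html)
    parser.close()
    if parser.total_chars == 0:
        return 1.0  # assumption: a page with no text is treated as non-content
    return parser.link_chars / parser.total_chars


def learn_threshold(samples):
    """Supervised step (assumed form): pick the threshold t that minimizes
    training errors for the rule 'content page if link ratio <= t'.

    samples is a list of (ratio, is_content) pairs.
    """
    if not samples:
        raise ValueError("at least one labeled sample is required")
    ratios = sorted({r for r, _ in samples})
    # Candidate thresholds: midpoints between consecutive distinct ratios.
    candidates = [(a + b) / 2 for a, b in zip(ratios, ratios[1:])] or ratios

    def errors(t):
        return sum((r <= t) != is_content for r, is_content in samples)

    return min(candidates, key=errors)


def is_content_page(html, threshold):
    """Applies the learned rule to a new page."""
    return link_ratio(html) <= threshold
```

Under these assumptions, a page dominated by navigation links yields a ratio near 1 and an article page yields a ratio near 0. A real rule set would likely combine link ratio with other page features.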
Source
《计算机工程与应用》
CSCD
PKU Core (Peking University Core Journals list)
2004, Issue 27, pp. 151-153 (3 pages)
Computer Engineering and Applications