Abstract
With the explosive growth of information available on the Internet, users are often overwhelmed by the volume of material that might interest them, and helping them find interesting and useful information has become an urgent problem in information retrieval. Because Web documents often have uncertain characteristics, fuzzy set theory can be used to model the uncertainty of the retrieval process. This paper proposes a method based on fuzzy correlation techniques for classifying Web documents into a predefined set of categories. Experimental results show that the method achieves higher classification accuracy than Web classification based on the vector space model.
Source
《计算机工程》
EI
CAS
CSCD
Peking University Core Journal
2005, Issue 24, pp. 13-14, 17 (3 pages)
Computer Engineering
Funding
Key Project funded by the Ministry of Education
Natural Science Foundation of Hainan Province
Keywords
Text mining
Document classification
Information filtering