Abstract
As Web applications have become widespread, Web content security has attracted considerable attention, and the classification and monitoring of Web content has become an active research topic. Based on an analysis of Web content security problems, this paper proposes a "demand model". The model builds on the vector space model (VSM) and uses a feature extraction strategy improved with Vague sets to extend the original document feature representation model. Experiments on a medium-sized corpus of pages collected from the real Web show that the demand model improves text classification for Web content security and outperforms methods that use traditional features.
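For readers unfamiliar with the vector space model that the abstract builds on, the following is a minimal sketch of a plain VSM document representation with cosine similarity. It is not the paper's method: the TF-IDF weighting, the function names `tfidf_vectors` and `cosine`, and the toy documents are all assumptions made for illustration, and the Vague-set feature extraction step is omitted because the abstract does not specify it.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map tokenized documents to sparse TF-IDF vectors (dict: term -> weight).

    Assumption: plain TF-IDF weighting; the paper's own weighting is not
    described in the abstract.
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus: two security-related pages and one unrelated page.
docs = [
    ["web", "security", "filter"],
    ["web", "security", "attack"],
    ["football", "match", "score"],
]
vecs = tfidf_vectors(docs)
```

In this sketch, `cosine(vecs[0], vecs[1])` is larger than `cosine(vecs[0], vecs[2])`, since the first two toy pages share terms and the third shares none. A feature extraction step such as the one the abstract describes would change which terms enter these vectors and how they are weighted.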
Source
Journal of Guangxi Normal University: Natural Science Edition (《广西师范大学学报(自然科学版)》)
CAS
Peking University Core Journals (北大核心)
2010, No. 1, pp. 147-152 (6 pages)
Funding
National Natural Science Foundation of China (60972139)
Beijing Natural Science Foundation (4062031, 4092041)