Categorization Method of Real-Time Web Page in Network Information Audit Systems
Abstract To categorize web pages in real time in network information audit systems, a new method based on Dempster-Shafer evidence theory is proposed. The basic idea is to skip IP fragment reassembly and to treat the web page address features and the individual fragment packets directly as evidence for categorization. The degree of support that each piece of evidence gives to each category is computed, and Dempster's combination rule is then used to fuse the evidence online, packet by packet, until a categorization result is reached. Once the evidence gathered so far is sufficient to categorize the page effectively, subsequent packets are not processed further. Experimental results show that the proposed method achieves a precision above 83% and a recall above 90%, and that it outperforms the existing fragment-based fuzzy K-nearest-neighbor categorization algorithm in both categorization performance and running time.
Source Journal of Xi'an Jiaotong University (《西安交通大学学报》; indexed in EI, CAS, CSCD, and the PKU Core list), 2006, No. 12, pp. 1393-1396 (4 pages)
Funding National High Technology Research and Development Program of China (2003AA148010); National Torch Program of China (2005EB011484)
Keywords network information audit; web page categorization; evidence theory
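The online fusion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the support degrees are computed or when evidence counts as sufficient, so the mass values, the `threshold` parameter, and the function names `combine`, `best_class`, and `classify_stream` are all assumptions made for illustration. For simplicity, mass is assigned only to single categories and to the whole frame of discernment (ignorance).

```python
from typing import Dict, Iterable, Optional, Tuple

THETA = "*"  # hypothetical marker for the whole frame of discernment (ignorance)
Mass = Dict[str, float]  # mass function over single categories plus THETA


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's combination rule for masses on singletons plus THETA."""
    fused: Mass = {}
    conflict = 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            if a == THETA:
                key = b              # THETA intersected with b is b
            elif b == THETA or a == b:
                key = a              # a intersected with THETA (or itself) is a
            else:
                conflict += pa * pb  # two different categories: empty intersection
                continue
            fused[key] = fused.get(key, 0.0) + pa * pb
    if 1.0 - conflict <= 1e-12:
        raise ValueError("evidence is totally conflicting")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in fused.items()}


def best_class(belief: Mass) -> Tuple[Optional[str], float]:
    """Return the single category with the largest mass, ignoring THETA."""
    candidates = {c: v for c, v in belief.items() if c != THETA}
    if not candidates:
        return None, 0.0
    label = max(candidates, key=candidates.get)
    return label, candidates[label]


def classify_stream(evidence: Iterable[Mass],
                    threshold: float = 0.9) -> Tuple[Optional[str], int]:
    """Fuse evidence one item at a time and stop as soon as one category
    reaches the (assumed) decision threshold. Returns the category and the
    number of evidence items consumed; if the threshold is never reached,
    the best guess after all evidence is returned."""
    belief: Mass = {THETA: 1.0}  # start from total ignorance
    used = 0
    for mass in evidence:
        belief = combine(belief, mass)
        used += 1
        label, support = best_class(belief)
        if support >= threshold:
            return label, used   # later packets are not processed
    return best_class(belief)[0], used
```

In use, the first item passed to `classify_stream` would stand for the evidence derived from the web page address and the remaining items for successive fragment packets; the early return mirrors the point that later packets are skipped once the page can be categorized.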
References (7)

  • 1. Yang Yiming, Liu Xin. A re-examination of text categorization methods [C] // Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM Press, 1999: 42-49.
  • 2. Li Jinku, Zhang Deyun, Gao Peng, Sun Qindong. Fuzzy categorization algorithm for text fragments in network information audit systems [J]. Journal of Xi'an Jiaotong University, 2005, 39(8): 800-803. (in Chinese)
  • 3. Klein L A. Multi-Sensor Data Fusion Theory and Applications [M]. Translated by Dai Yaping, Liu Zheng, Yu Guanghui. 2nd ed. Beijing: Beijing Institute of Technology Press, 2004: 74-84. (in Chinese)
  • 4. Salton G, Buckley C. Term-weighting approaches in automatic text retrieval [J]. Information Processing and Management, 1988, 24(5): 513-523.
  • 5. Yang Yiming, Pedersen J O. A comparative study on feature selection in text categorization [C] // Proceedings of the 14th International Conference on Machine Learning. Nashville, USA: Morgan Kaufmann Publishers, 1997: 412-420.
  • 6. Denoeux T. A k-nearest neighbor classification rule based on Dempster-Shafer theory [J]. IEEE Transactions on Systems, Man and Cybernetics, 1995, 25(5): 804-813.
  • 7. Xi'an Jiaotong University Jump Network Technology Co., Ltd. Jump information audit system [EB/OL]. [2005-12-15]. http://www.jump.net.cn/aqcp_xxsj.asp.
