
Application of Computer Text Information Mining Technology in Network Security

Cited by: 6
Abstract: To address the problem of judging whether network text is safe, an improved nearest-neighbor classification algorithm is used for text mining. The improved method defines classification features in the same way as the traditional method, but additionally introduces a collinearity discriminant matrix and merges features that have collinear attributes. This strategy not only improves the accuracy of the classification features but also speeds up the classification of text. Experiments are carried out on the Spambase corpus, and the classification results are evaluated along four dimensions: precision, recall, a joint-judgment measure (联判度), and error. The results show that the improved nearest-neighbor method has a clear advantage and distinguishes safe text from dangerous text more accurately.
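The abstract does not define the collinearity discriminant matrix, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that a Pearson correlation matrix can stand in for the collinearity test, that collinear features are merged by simple averaging, and that a plain majority-vote k-nearest-neighbor classifier is used afterwards. All function names and the `threshold` parameter are hypothetical, and the demo uses synthetic data rather than Spambase.

```python
import numpy as np


def find_collinear_groups(X, threshold=0.9):
    """Greedily group columns whose Pearson correlation is >= threshold.

    Assumption: the correlation matrix plays the role of the paper's
    collinearity discriminant matrix, which the abstract does not specify.
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # constant columns correlate with nothing
    Z = (X - X.mean(axis=0)) / std
    corr = Z.T @ Z / X.shape[0]              # Pearson correlation matrix
    n_features = X.shape[1]
    assigned = np.zeros(n_features, dtype=bool)
    groups = []
    for j in range(n_features):
        if assigned[j]:
            continue
        members = [k for k in range(j, n_features)
                   if not assigned[k] and (k == j or corr[j, k] >= threshold)]
        assigned[members] = True
        groups.append(members)
    return groups


def merge_features(X, groups):
    """Replace each group of columns by the mean of its raw columns."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([X[:, g].mean(axis=1) for g in groups])


def knn_predict(X_train, y_train, X_test, k=3):
    """Majority-vote k-nearest-neighbor prediction with Euclidean distance."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=int)
    preds = []
    for x in np.asarray(X_test, dtype=float):
        dist = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dist)[:k]
        votes = np.bincount(y_train[nearest], minlength=2)
        preds.append(int(np.argmax(votes)))
    return np.array(preds)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class labelled `positive`."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return float(precision), float(recall)


if __name__ == "__main__":
    # Synthetic stand-in data: column 1 is nearly a multiple of column 0.
    rng = np.random.default_rng(0)
    base = rng.normal(size=(200, 3))
    X = np.column_stack([base[:, 0],
                         2 * base[:, 0] + 0.01 * rng.normal(size=200),
                         base[:, 1],
                         base[:, 2]])
    y = (base[:, 0] + base[:, 1] > 0).astype(int)
    groups = find_collinear_groups(X[:150])      # groups learned on training rows only
    X_merged = merge_features(X, groups)
    preds = knn_predict(X_merged[:150], y[:150], X_merged[150:], k=5)
    print(groups, precision_recall(y[150:], preds))
```

Merging before classification reduces the number of dimensions each distance computation has to cover, which is one plausible reading of why the abstract reports faster classification. Averaging raw columns is a simplification; features on very different scales would normally be standardized first.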
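Author: Han Wenzhi (韩文智)
Source: Journal of Huaqiao University (Natural Science) (《华侨大学学报(自然科学版)》), indexed in CAS and the Peking University Core Journals list, 2016, No. 1, pp. 67-70 (4 pages)
Funding: Key Project of the Natural Science Foundation of Sichuan Province (15ZA0349)
Keywords: text information; text mining; text classification; nearest-neighbor classification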

