Anomaly Classification Detection of Historical Trouble Tickets Based on LSA
Abstract: Detecting problem-classification anomalies in historical data is an important step in data preprocessing. It directly affects the accuracy of subsequent classification and clustering, and high-quality historical data can significantly improve both. However, trouble tickets have complex content and a free-form structure, so traditional outlier detection algorithms are no longer applicable. To address this, the paper builds on distance-based outlier detection and introduces latent semantic analysis (LSA) and information entropy. It defines an anomaly degree based on LSA and information entropy to measure how anomalous a trouble ticket is relative to other tickets, and proposes an LSA-based algorithm for detecting problem-classification anomalies in trouble tickets. Theoretical analysis and experimental results indicate that the improved algorithm is effective and feasible, and that it performs well.
Authors: ZHANG Hang (张航); XU Jian (徐建) (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2018, No. 5, pp. 950-955 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61000053)
Keywords: trouble ticket; data preprocessing; outlier detection; latent semantic analysis; information entropy
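
The abstract does not give the paper's definition of the anomaly degree, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes all tickets carry the same problem-category label, uses log-entropy term weighting (a common LSA weighting that relies on information entropy), and scores each ticket by its mean cosine distance to its nearest neighbors in the LSA space. The function names, the toy tickets, and the parameters `k` and `n_neighbors` are hypothetical and are not taken from the paper.

```python
import numpy as np

def log_entropy_weight(counts):
    """Log-entropy weighting of a term-by-document count matrix (assumed weighting)."""
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[1]
    row_sums = counts.sum(axis=1, keepdims=True)
    # p[i, j]: share of term i's occurrences that fall in document j
    p = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    plogp = p * np.log(np.where(p > 0, p, 1.0))
    # Global weight is 1 for a term confined to one document, 0 for a term spread evenly
    global_w = 1.0 + plogp.sum(axis=1) / np.log(n_docs)
    return np.log1p(counts) * global_w[:, None]

def lsa_embed(weighted, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    _, s, vt = np.linalg.svd(weighted, full_matrices=False)
    return vt[:k].T * s[:k]  # one row per document

def outlier_scores(doc_vecs, n_neighbors):
    """Mean cosine distance from each document to its nearest neighbors."""
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    unit = doc_vecs / np.where(norms > 0, norms, 1.0)
    dist = 1.0 - unit @ unit.T
    np.fill_diagonal(dist, np.inf)  # ignore each document's distance to itself
    nearest = np.sort(dist, axis=1)[:, :n_neighbors]
    return nearest.mean(axis=1)

# Hypothetical tickets that were all filed under one category, e.g. "storage"
tickets = [
    "disk full on database server",
    "database server disk usage high",
    "disk space low on server",
    "server disk full alert",
    "user cannot reset email password",
]
vocab = sorted({w for t in tickets for w in t.split()})
counts = np.array([[t.split().count(w) for t in tickets] for w in vocab])

scores = outlier_scores(lsa_embed(log_entropy_weight(counts), k=2), n_neighbors=2)
suspect = int(np.argmax(scores))  # index of the ticket least like its peers
print(suspect, tickets[suspect])
```

In this toy run the password-reset ticket shares no vocabulary with the disk-related tickets, so it receives the highest score and would be flagged as a likely misclassification. A real pipeline would need proper tokenization of ticket text and tuned values for `k` and `n_neighbors`.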
