Outlier Detection for Uncertain Categorical Data Based on Weighted Expected Semantic Distance
Abstract: Outlier detection on uncertain data identifies the objects in an uncertain data set that differ from most other objects. This paper measures the distance between objects with the expected semantic distance and proposes a method for computing a weighted expected semantic distance. Attribute weights reflect the differing contributions of the attributes to the expected semantic distance, which makes the detection results more application-driven and more effective. The algorithm detects outliers in categorical data sets and avoids the inaccuracy that arises when conventional outlier detection methods ignore the differences between objects in the database. Experimental results show that, on categorical data, outlier detection based on the weighted expected semantic distance overcomes the shortcomings of traditional distance measures in outlier detection algorithms and improves the performance of the algorithm.
Authors: Zhao Qinyi (赵秦怡); Hei Shaomin (黑韶敏) (College of Mathematics and Computer, Dali University, Dali, Yunnan 671003, China)
Source: Journal of Dali University (《大理大学学报》), CAS, 2019, No. 12, pp. 1-5 (5 pages)
Keywords: outlier detection; weighted expected semantic distance; uncertain data
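The abstract does not give the formulas, so the following is only a minimal sketch of how a weighted expected semantic distance could drive outlier detection, not the authors' algorithm. It assumes that each uncertain categorical attribute value is a probability distribution over categories, that the semantic distance between two categories comes from a lookup table, and that an object is flagged when most other objects lie beyond a fixed radius (a simple distance-based rule chosen here for illustration). All function names, the data, the weights, and the thresholds are hypothetical.

```python
from itertools import product


def make_semantic_distance(table):
    """Build a symmetric distance function from a dict of category pairs.

    Identical categories have distance 0; unlisted pairs default to 1.0
    (an assumption made for this sketch).
    """
    def dist(a, b):
        if a == b:
            return 0.0
        return table.get((a, b), table.get((b, a), 1.0))
    return dist


def expected_semantic_distance(p, q, dist):
    """Expected distance between two uncertain values of one attribute.

    p and q map each category to its probability.
    """
    return sum(pa * qb * dist(a, b)
               for (a, pa), (b, qb) in product(p.items(), q.items()))


def weighted_expected_distance(x, y, weights, dists):
    """Weighted sum of the per-attribute expected semantic distances."""
    return sum(w * expected_semantic_distance(xa, ya, d)
               for xa, ya, w, d in zip(x, y, weights, dists))


def detect_outliers(objects, weights, dists, radius, min_fraction):
    """Return indices of objects for which at least min_fraction of the
    other objects are farther away than radius."""
    outliers = []
    for i, x in enumerate(objects):
        far = sum(
            1 for j, y in enumerate(objects)
            if j != i
            and weighted_expected_distance(x, y, weights, dists) > radius
        )
        if far >= min_fraction * (len(objects) - 1):
            outliers.append(i)
    return outliers


# Hypothetical data: two attributes (colour, size), four uncertain objects.
colour = make_semantic_distance({("red", "orange"): 0.2})
size = make_semantic_distance({("small", "medium"): 0.5,
                               ("medium", "large"): 0.5})
objects = [
    ({"red": 0.9, "orange": 0.1}, {"small": 1.0}),
    ({"red": 0.8, "orange": 0.2}, {"small": 0.7, "medium": 0.3}),
    ({"orange": 1.0}, {"small": 0.6, "medium": 0.4}),
    ({"blue": 1.0}, {"large": 1.0}),
]
weights = [0.7, 0.3]  # colour is assumed to matter more than size
outliers = detect_outliers(objects, weights, [colour, size],
                           radius=0.5, min_fraction=0.9)
print(outliers)  # [3]
```

In this toy data the fourth object is far from all three others under the weighted distance, so it is the only one flagged. Changing the weights changes which attribute dominates the distance, which is the effect the abstract attributes to attribute weighting.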
