Abstract
Outlier detection on uncertain data identifies the objects in an uncertain data set that differ from most other objects. This paper measures the distance between objects with the expected semantic distance and proposes a method for computing a weighted expected semantic distance. Weighting the attributes reflects their different contributions to the expected semantic distance, which makes the detection results more application-driven and more effective. The algorithm detects outliers in categorical data sets and avoids the inaccuracy that arises when conventional outlier detection methods ignore the differences between objects in the database. Experimental results show that the weighted expected semantic distance method for categorical data overcomes the shortcomings of traditional distance measures in outlier detection and improves the performance of the algorithm.
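The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: each uncertain categorical attribute is modelled as a probability distribution over category labels, a simple 0/1 mismatch stands in for the semantic distance, the attribute weights are supplied by the user, and the outlier score is the distance to the k-th nearest neighbour. All function names here are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Assumption: an uncertain categorical attribute is a probability
# distribution over category labels; an object is a list of such attributes.
Dist = Dict[str, float]
UncertainObject = List[Dist]


def expected_mismatch(p: Dist, q: Dist) -> float:
    """Expected 0/1 distance between two independent uncertain attributes.

    The 0/1 mismatch is a stand-in for the paper's semantic distance.
    """
    agree = sum(prob * q.get(value, 0.0) for value, prob in p.items())
    return 1.0 - agree


def weighted_expected_distance(
    a: UncertainObject, b: UncertainObject, weights: Sequence[float]
) -> float:
    """Weighted average of per-attribute expected distances."""
    total = sum(weights)
    return sum(
        w * expected_mismatch(p, q) for w, p, q in zip(weights, a, b)
    ) / total


def knn_outlier_scores(
    objects: Sequence[UncertainObject], weights: Sequence[float], k: int = 1
) -> List[float]:
    """Score each object by its distance to its k-th nearest neighbour."""
    scores = []
    for i, a in enumerate(objects):
        dists = sorted(
            weighted_expected_distance(a, b, weights)
            for j, b in enumerate(objects)
            if j != i
        )
        scores.append(dists[k - 1])
    return scores
```

In this sketch, objects with the largest scores would be reported as outlier candidates; raising the weight of an attribute makes disagreement on that attribute count more heavily toward the score.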
Authors
Zhao Qinyi (赵秦怡)
Hei Shaomin (黑韶敏)
College of Mathematics and Computer, Dali University, Dali, Yunnan 671003, China
Source
Journal of Dali University (《大理大学学报》)
CAS
2019, Issue 12, pp. 1-5 (5 pages)
Keywords
outlier detection
weighted expected semantic distance
uncertain data