Outlier Detection in Uncertain Categorical Data Based on Weighted Expected Semantic Distance

Published English title: Weighted Expected Semantic Distance Based on Outlier Detection of Uncertain Classification Data

Abstract: Outlier detection on uncertain data identifies the objects in an uncertain data set that differ from most other objects. This paper measures the distance between objects with the expected semantic distance and proposes a method for computing a weighted expected semantic distance. Attribute weighting reflects the differing contributions of individual attributes to the expected semantic distance, which makes the detection results more application-driven and more effective. The algorithm detects outliers in categorical data sets and avoids the inaccurate results that arise when conventional outlier detection methods ignore the differences among objects in the database. Experimental results show that the weighted expected semantic distance method for categorical data overcomes the shortcomings of traditional distance measures in outlier detection and improves the performance of the algorithm.

Authors: Zhao Qinyi; Hei Shaomin (College of Mathematics and Computer, Dali University, Dali, Yunnan 671003, China)

Affiliation: College of Mathematics and Computer, Dali University

Source: Journal of Dali University (《大理大学学报》), CAS, 2019, No. 12, pp. 1-5 (5 pages)

Keywords: outlier detection; weighted expected semantic distance; uncertain data

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]
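
The abstract describes the method only in words and gives no formulas, so the following is a minimal sketch of one possible reading, not the authors' algorithm. It assumes that each uncertain categorical attribute is stored as a probability distribution over its values, that a semantic distance between any two values is supplied by the user, and that an object is flagged when it has too few neighbors within a fixed radius. The table `SEMANTIC`, the attribute weights, the sample objects, and the parameters `radius` and `min_neighbors` are all hypothetical.

```python
from itertools import product

# Hypothetical semantic distances between categorical values.
# Pairs are unordered; identical values have distance 0.
SEMANTIC = {
    "color": {("red", "orange"): 0.2, ("red", "blue"): 1.0, ("orange", "blue"): 0.9},
    "size": {("small", "medium"): 0.5, ("small", "large"): 1.0, ("medium", "large"): 0.5},
}


def semantic_distance(attr, a, b):
    """Look up the assumed semantic distance between two values of one attribute."""
    if a == b:
        return 0.0
    table = SEMANTIC[attr]
    # Unknown pairs fall back to the maximum distance 1.0 (an assumption).
    return table.get((a, b), table.get((b, a), 1.0))


def expected_distance(attr, p, q):
    """Expected semantic distance between two uncertain values.

    p and q map each possible value to its probability.
    """
    return sum(
        pa * qb * semantic_distance(attr, a, b)
        for (a, pa), (b, qb) in product(p.items(), q.items())
    )


def weighted_expected_distance(x, y, weights):
    """Weighted average of the per-attribute expected distances."""
    total = sum(weights.values())
    return sum(
        w * expected_distance(attr, x[attr], y[attr]) for attr, w in weights.items()
    ) / total


def find_outliers(objects, weights, radius, min_neighbors):
    """Return indices of objects with fewer than min_neighbors within radius."""
    outliers = []
    for i, x in enumerate(objects):
        neighbors = sum(
            1
            for j, y in enumerate(objects)
            if j != i and weighted_expected_distance(x, y, weights) <= radius
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers


# Hypothetical uncertain objects: each attribute is a distribution over values.
objects = [
    {"color": {"red": 0.9, "orange": 0.1}, "size": {"small": 1.0}},
    {"color": {"red": 0.8, "orange": 0.2}, "size": {"small": 0.7, "medium": 0.3}},
    {"color": {"orange": 1.0}, "size": {"small": 0.9, "medium": 0.1}},
    {"color": {"blue": 1.0}, "size": {"large": 1.0}},
]
weights = {"color": 2.0, "size": 1.0}  # assumed: color matters twice as much

print(find_outliers(objects, weights, radius=0.3, min_neighbors=1))
```

In this toy data the first three objects lie within 0.3 of one another, while the fourth is roughly 0.9 or more from each of them, so only the fourth is reported. Raising the weight of an attribute increases its share of the total distance, which is the effect the abstract attributes to attribute weighting.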
