
An Outlier Detection Method Based on Ranking and Clustering in Bi-typed Heterogeneous Networks

Cited by: 6
Abstract: Mining outliers, that is, objects that differ from the normal data objects in a network, is one of the important tasks in data mining. Research on outlier detection in bi-typed heterogeneous information networks is still relatively scarce, and detection methods designed for homogeneous networks are difficult to apply to bi-typed heterogeneous networks. We therefore propose a Rank-Kmeans Based Outlier detection method, called RKBOutlier, for heterogeneous information networks. Two types of objects, together with the semantic information carried by the links between them, are extracted from the heterogeneous information network. The objects to be examined are treated as attribute objects, and the objects of the other type are treated as target objects. The target objects are clustered, and the distribution of each attribute object across the resulting clusters is examined; attribute objects whose distribution is abnormal are regarded as outliers. Ranking and clustering are combined to significantly improve clustering accuracy. Experimental results show that RKBOutlier can effectively detect outliers in bi-typed heterogeneous information networks.
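The pipeline described in the abstract (cluster the target objects, then examine how each attribute object is distributed across the clusters) can be illustrated with a minimal Python sketch. It is not the authors' RKBOutlier implementation. Plain k-means stands in for the paper's Rank-Kmeans, the ranking step is omitted, and scoring by normalised entropy is an assumption, since the abstract does not say how an "abnormal" distribution is measured. The link matrix and all function names are hypothetical.

```python
import math

def kmeans(rows, k, iters=20):
    """Plain Lloyd's k-means (stand-in for Rank-Kmeans; no ranking step)."""
    def dist2(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))
    # Deterministic farthest-point initialisation, starting from the first row.
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(dist2(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_distribution(links, labels, k):
    """Share of each attribute object's link weight that falls in each cluster."""
    dist = [[0.0] * k for _ in range(len(links[0]))]
    for row, lab in zip(links, labels):
        for a, w in enumerate(row):
            dist[a][lab] += w
    for d in dist:
        total = sum(d)
        if total > 0:
            d[:] = [x / total for x in d]
    return dist

def outlier_scores(dist, k):
    """Assumed score: entropy normalised to [0, 1] (requires k > 1).
    A higher value means the object is spread more evenly over clusters."""
    return [-sum(p * math.log(p) for p in d if p > 0) / math.log(k) for d in dist]

# Hypothetical link matrix: rows = target objects, columns = attribute objects.
links = [
    [3, 2, 0, 0, 1],
    [2, 3, 0, 0, 1],
    [3, 3, 0, 0, 1],
    [0, 0, 3, 2, 1],
    [0, 0, 2, 3, 1],
    [0, 0, 3, 3, 1],
]
k = 2
labels = kmeans(links, k)
scores = outlier_scores(cluster_distribution(links, labels, k), k)
ranked = sorted(range(len(scores)), key=lambda a: scores[a], reverse=True)
print(ranked[0], scores)  # attribute object 4 is spread evenly, so it ranks first
```

In this toy data the first three target objects form one cluster and the last three another; attribute objects 0 to 3 each link into a single cluster, while attribute object 4 links equally into both and therefore receives the highest score. Whether a dispersed or a concentrated distribution should count as anomalous depends on the application and on the paper's own definition, which the abstract does not give.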
Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2018, No. 2, pp. 281-288 (8 pages)
Funding: National Natural Science Foundation of China (No. 60903098); Jilin University Graduate Innovation Fund (No. 2016183, No. 2016184)
Keywords: outlier detection; ranking; clustering; target object; attribute object