374 articles found
An Efficient Outlier Detection Approach on Weighted Data Stream Based on Minimal Rare Pattern Mining (Cited by: 1)
1
Authors: Saihua Cai, Ruizhi Sun, Shangbo Hao, Sicong Li, Gang Yuan 《China Communications》 SCIE CSCD, 2019, No. 10, pp. 83-99 (17 pages)
The distance-based outlier detection method detects implied outliers by calculating the distances between points in the dataset, but its computational complexity is particularly high when processing multidimensional datasets. In addition, the traditional outlier detection method does not consider how frequently subsets occur, so the detected outliers do not fit the definition of outliers (i.e., rarely appearing). Pattern mining-based outlier detection approaches have solved this problem, but they do not take the importance of each pattern into account, so the detected outliers cannot truly reflect some actual situations. To address these problems, a two-phase minimal weighted rare pattern mining-based outlier detection approach, called MWRPM-Outlier, is proposed to effectively detect outliers on weighted data streams. In particular, a method called MWRPM is proposed in the pattern mining phase to quickly mine the minimal weighted rare patterns, and two deviation factors are then defined in the outlier detection phase to measure the abnormal degree of each transaction on the weighted data stream. Experimental results show that the proposed MWRPM-Outlier approach performs excellently in outlier detection and that MWRPM is superior in weighted rare pattern mining.
Keywords: outlier detection; weighted data stream; minimal weighted rare pattern mining; deviation factors
Constructing Three-Dimension Space Graph for Outlier Detection Algorithms in Data Mining (Cited by: 1)
2
Authors: ZHANG Jing (1, 2), SUN Zhi-hui (1). 1. Department of Computer Science and Engineering, Southeast University, Nanjing 210096, Jiangsu, China; 2. Department of Electricity and Information Engineering, Jiangsu University, Zhenjiang 212001, Jiangsu, China 《Wuhan University Journal of Natural Sciences》 EI CAS, 2004, No. 5, pp. 585-589 (5 pages)
Outlier detection has significant practical value in data mining. Outlier detection algorithms based on distinct theories have different definitions and mining processes. Based on an analysis of the criteria and theory behind existing outlier detection algorithms, a three-dimensional space graph for constructing applied algorithms and an improved GridOf algorithm are proposed. CLC number: TP 311.13; TP 391. Foundation item: Supported by the National Natural Science Foundation of China (70371015). Biography: ZHANG Jing (1975-), female, Ph.D., lecturer, research direction: data mining and knowledge discovery.
Keywords: outlier detection; three-dimensional space graph; data mining
Anomalous Cell Detection with Kernel Density-Based Local Outlier Factor (Cited by: 2)
3
Authors: Miao Dandan, Qin Xiaowei, Wang Weidong 《China Communications》 SCIE CSCD, 2015, No. 9, pp. 64-75 (12 pages)
As data services rapidly penetrate daily life, the mobile network becomes more complicated and the amount of data transmitted keeps increasing. In this case, the traditional statistical methods for anomalous cell detection cannot adapt to the evolution of networks, and data mining becomes the mainstream. In this paper, we propose a novel kernel density-based local outlier factor (KLOF) to assign a degree of being an outlier to each object. First, the notion of KLOF is introduced, which captures exactly the relative degree of isolation. Then, by analyzing its properties, including the tightness of its upper and lower bounds and its sensitivity to density perturbation, we find that KLOF is much greater than 1 for outliers. Lastly, KLOF is applied to a real-world dataset to detect anomalous cells with abnormal key performance indicators (KPIs) to verify its reliability. The experiment shows that KLOF can find outliers efficiently. It can serve as a guideline for operators to perform faster and more efficient troubleshooting.
Keywords: data mining; key performance indicators; kernel density-based local outlier factor; density perturbation; anomalous cell detection
Association discovery and outlier detection of air pollution emissions from industrial enterprises driven by big data
4
Authors: Zhen Peng, Yunxiao Zhang, Yunchong Wang, Tianle Tang 《Data Intelligence》 EI, 2023, No. 2, pp. 438-456 (19 pages)
Air pollution is a major issue related to the national economy and people's livelihood. At present, research on air pollution mostly focuses on pollutant emissions in a specific industry or region as a whole, and there is a lack of attention to enterprise pollutant emissions at the micro level. Limited by the amount and time granularity of data from enterprises, enterprise pollutant emissions are still understudied. Driven by big data on air pollution emissions of industrial enterprises monitored in Beijing-Tianjin-Hebei, this paper carries out data mining of enterprise pollution emissions, including association analysis between different features based on grey association, association mining between different data based on association rules, and outlier detection based on clustering. The results show that: (1) the industries affecting NOx and SO2 in Beijing-Tianjin-Hebei are mainly electric power, heat production and supply, and metal smelting and processing; (2) the districts near Hengshui and Shijiazhuang in Hebei province form strong association rules; (3) the industrial enterprises in Beijing-Tianjin-Hebei are divided into six clusters, of which three are outliers with excessive emissions of total VOCs, PM, and NH3, respectively.
Keywords: air pollution emissions of enterprises; outlier detection based on clustering; association rule mining; grey association analysis; big data
Outliers Mining in Time Series Data Sets (Cited by: 3)
5
Authors: Zheng Binxiang, Du Xiuhua & Xi Yugeng (Institute of Automation, Shanghai Jiaotong University, Shanghai 200030, P.R. China) 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD, 2002, No. 1, pp. 93-97 (5 pages)
In this paper, we present a cluster-based algorithm for time series outlier mining. We use the discrete Fourier transform (DFT) to transform time series from the time domain to the frequency domain. Time series can thus be mapped to points in k-dimensional space. A cluster-based algorithm is developed to mine the outliers from these points. The algorithm first partitions the input points into disjoint clusters and then prunes the clusters judged unable to contain outliers. Our algorithm has been run on the electrical load time series of one steel enterprise and proved to be effective.
Keywords: data mining; time series; outlier mining
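The mapping described in this abstract (time series to DFT coefficients to points in k-dimensional space) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function names, the toy data, and in particular the nearest-neighbour distance score, which stands in for the paper's cluster partitioning and pruning step that the abstract does not specify.

```python
import cmath
import math


def dft_features(series, k):
    """Return the normalized magnitudes of the first k DFT coefficients."""
    n = len(series)
    feats = []
    for f in range(k):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t, x in enumerate(series))
        feats.append(abs(coeff) / n)
    return feats


def nearest_neighbor_scores(points):
    """Score each point by the distance to its nearest other point.

    This is a simple stand-in, not the cluster-based pruning of the paper.
    """
    return [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]


# Hypothetical toy data: three similar curves and one with a different frequency.
n = 16
normal = [[math.sin(2 * math.pi * t / n) + 0.01 * s for t in range(n)]
          for s in range(3)]
odd = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
series = normal + [odd]

features = [dft_features(s, k=4) for s in series]
scores = nearest_neighbor_scores(features)
outlier_index = max(range(len(scores)), key=scores.__getitem__)
```

Here the series with the different dominant frequency lies far from the others in the 4-dimensional feature space, so it receives the largest score.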
Combined data mining techniques based patient data outlier detection for healthcare safety (Cited by: 1)
6
Authors: Gebeyehu Belay Gebremeskel, Chai Yi, Zhongshi He, Dawit Haile 《International Journal of Intelligent Computing and Cybernetics》 EI, 2016, No. 1, pp. 42-68 (27 pages)
Purpose: Among the growing number of data mining (DM) techniques, outlier detection has gained importance in many applications and has attracted much attention in recent times. In the past, outlier detection research in safety care could be viewed as searching for needles in a haystack. However, outliers are not always erroneous. Therefore, the purpose of this paper is to investigate the role of outliers in healthcare services in general and in patient safety care in particular. Design/methodology/approach: A combined DM technique (clustering and nearest neighbor) is used for outlier detection, which provides a clear understanding and meaningful insights for visualizing data behavior for healthcare safety. The outcomes, or the implicit knowledge, are vital to a proper clinical decision-making process. The method is semantically important, and its novel treatment of patients' events and situations plays a significant role in patient care safety and medication. Findings: The paper discusses a novel, integrated methodology that can be applied to the analysis of different biological data. It integrates DM techniques to optimize performance in the field of health and medical science. The integrated outlier detection method can be extended to search for valuable information and implicit knowledge based on selected patient factors. On this basis, outliers are detected as clusters and point events, and novel ideas are proposed to strengthen clinical services with customer satisfaction in mind. It can also serve as a baseline for further healthcare strategy development and research. Research limitations/implications: This paper focuses mainly on outlier detection. Outlier isolation, which is essential for investigating why an outlier occurred and for communicating how to mitigate it, is not addressed. The research can therefore be extended to the hierarchy of patient problems. Originality/value: DM is a dynamic and successful gateway for discovering useful knowledge to enhance healthcare performance and patient safety. Outlier detection based on clinical data is a basic task in achieving healthcare strategy. In this paper, the authors therefore focus on combined DM techniques for in-depth analysis of clinical data, which support an optimal level of clinical decision-making. Proper clinical decisions can be reached through attribute selection, which is important for identifying the influential factors or parameters of healthcare services. Using integrated clustering and nearest-neighbor techniques therefore gives a more acceptable search for outliers in such complex data, which could be fundamental to further situational analysis of healthcare and patient safety.
Keywords: data mining; clustering; healthcare; mining algorithm; nearest neighbor; outlier detection
Outlier screening for ironmaking data on blast furnaces (Cited by: 6)
7
Authors: Jun Zhao, Shao-fei Chen, Xiao-jie Liu, Xin Li, Hong-yang Li, Qing Lyu 《International Journal of Minerals,Metallurgy and Materials》 SCIE EI CAS CSCD, 2021, No. 6, pp. 1001-1010 (10 pages)
Blast furnace data processing is prone to problems such as outliers. To overcome these problems and identify an improved method for processing blast furnace data, we conducted an in-depth study of blast furnace data. Based on data samples from selected iron and steel companies, data types were classified according to their characteristics; appropriate methods were then selected to process them in order to resolve the deficiencies and outliers of the original blast furnace data. Linear interpolation was used to fill in the divided continuation data, the K-nearest neighbor (KNN) algorithm was used to fill in correlation data with the internal law, and periodic statistical data were filled by the average. The error rate in the filling was low, and the fitting degree was over 85%. For the screening of outliers, corresponding indicator parameters were added according to the continuity, relevance, and periodicity of different data, and a variety of algorithms were used for processing. The analysis of the screening results shows that a large amount of useful information in the data was retained and that ineffective outliers were eliminated. Standardized processing of blast furnace big data, as the basis of applied research on blast furnace big data, can serve as an important means of improving data quality and retaining data value.
Keywords: blast furnace; data missing; outliers; data processing; data mining
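Two of the filling strategies named in this abstract, linear interpolation for continuous data and average filling for periodic statistics, can be sketched as follows. This is only an illustrative sketch: the function names, the use of `None` to mark missing values, and the decision to leave leading and trailing gaps unfilled are assumptions, not details taken from the paper. The KNN-based filling is not shown.

```python
def fill_linear(values):
    """Fill None gaps by linear interpolation between the nearest known values.

    Assumption of this sketch: gaps before the first or after the last
    known value are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def fill_mean(values):
    """Fill None gaps with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


# Hypothetical readings with missing entries marked as None.
continuous = fill_linear([1.0, None, None, 4.0])
periodic = fill_mean([2.0, None, 4.0])
```

Interpolation suits slowly varying continuous signals, while the mean is a reasonable default for statistics that repeat over a fixed period.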
Continuous Outlier Monitoring on Uncertain Data Streams (Cited by: 1)
8
Authors: 曹科研, 王国仁, 韩东红, 丁国辉, 王爱侠, 石凌旭 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2014, No. 3, pp. 436-448 (13 pages)
Outlier detection on data streams is an important task in data mining. The challenges become even larger when considering uncertain data. This paper studies the problem of outlier detection on uncertain data streams. We propose Continuous Uncertain Outlier Detection (CUOD), which can quickly determine the nature of the uncertain elements by pruning to improve efficiency. Furthermore, we propose a pruning approach, Probability Pruning for Continuous Uncertain Outlier Detection (PCUOD), to reduce the detection cost. It is an estimated outlier probability method that can effectively reduce the amount of calculation. The cost of the PCUOD incremental algorithm can satisfy the demands of uncertain data streams. Finally, a new method for parameter variable queries to CUOD is proposed, enabling the concurrent execution of different queries. To the best of our knowledge, this paper is the first work to perform outlier detection on uncertain data streams that can handle parameter variable queries simultaneously. Our methods are verified using both real data and synthetic data. The results show that they are able to reduce the required storage and running time.
Keywords: outlier detection; uncertain data stream; data mining; parameter variable query
Outlier Detection Method based on Hybrid Rough - Negative Algorithm
9
Authors: Faizah Shaari, Azmi Ahmad, Zalizah A. Long 《Journal of Mathematics and System Science》, 2014, No. 6, pp. 391-397 (7 pages)
This paper discusses the detection of outliers by hybridizing the Rough_Outlier algorithm with Negative Association Rules. An optimization algorithm named Binary Particle Swarm Optimization (Binary PSO) is used to improve the computation of Non_Reduct in order to detect outliers. With the Binary PSO algorithm, the rules generated by the Rough_Outlier algorithm are optimized, so that significant outlier objects are detected. The outlier detection process is then enhanced by hybridizing it with Negative Association Rules. Frequent and infrequent item sets are generated from the outlier rules. Results show that the hybrid Rough_Negative algorithm is able to uncover meaningful knowledge of outliers from the frequent and infrequent item sets. This knowledge can then be used by domain experts for better decision making.
Keywords: negative association rules; association rules mining; outlier; non-reduct; infrequent item sets; frequent item sets; rare
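Outlier-DivideConquer: An Outlier Divide-and-Conquer Sampling Algorithm for Approximate Aggregate Queries (Cited by: 1)
10
Authors: 胡文瑜, 孙志挥, 张柏礼 《南京大学学报(自然科学版)》 CAS CSCD PKU Core, 2011, No. 5, pp. 524-531 (8 pages)
Sampling is a general and effective approximation technique, and using sampling for approximate aggregate query processing is a common approach in implementing decision support systems and data mining. The key goal of approximate query processing is to return approximate results correctly and efficiently while minimizing the approximation error. Building on an in-depth study of sampling methods for approximate aggregate queries, this paper proposes Outlier-DivideConquer, an outlier divide-and-conquer sampling algorithm that has a guaranteed error bound and needs only a single scan of the dataset. When the aggregate attribute has a high-variance distribution, the algorithm overcomes the limitations of uniform random sampling, significantly reduces the approximate query error, and runs more efficiently than comparable algorithms. Finally, an experimental comparison with the traditional uniform sampling algorithm verifies the effectiveness and correctness of Outlier-DivideConquer.
Keywords: data mining; decision support; approximate aggregate query; uniform sampling; outlier divide-and-conquer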
Meta-path-based outlier detection in heterogeneous information network (Cited by: 2)
11
Authors: Lu LIU, Shang WANG 《Frontiers of Computer Science》 SCIE EI CSCD, 2020, No. 2, pp. 388-403 (16 pages)
Mining outliers in heterogeneous networks is crucial to many applications, but challenges abound. In this paper, we focus on identifying meta-path-based outliers in heterogeneous information networks (HINs) and calculate the similarity between different types of objects. We propose a meta-path-based outlier detection method (MPOutliers) for heterogeneous information networks that deals with these problems in one go under a unified framework. MPOutliers calculates the heterogeneous reachable probability by combining different types of objects and their relationships. It discovers the semantic information among nodes in heterogeneous networks, instead of only considering the network structure. It also computes the closeness degree between nodes of the same type, which extends the whole heterogeneous network. Moreover, each node is assigned a reliable weighting to measure its authority degree. Substantial experiments on two real datasets (the AMiner and Movies datasets) show that our proposed method is very effective and efficient for outlier detection.
Keywords: data mining; heterogeneous information network; outlier detection; short text similarity
Outlier detection based on multi-dimensional clustering and local density
12
Authors: SHOU Zhao-yu, LI Meng-ya, LI Si-min 《Journal of Central South University》 SCIE EI CAS CSCD, 2017, No. 6, pp. 1299-1306 (8 pages)
Outlier detection is an important task in data mining. In fact, it is difficult to find the clustering centers in some sophisticated multidimensional datasets and to measure the deviation degree of each potential outlier. In this work, an effective outlier detection method based on multi-dimensional clustering and local density (ODBMCLD) is proposed. ODBMCLD first identifies the center objects by the local density peak of data objects, and clusters the whole dataset based on the center objects. Then, outlier objects belonging to different clusters are marked as candidates of abnormal data. Finally, the top N points among these abnormal candidates are chosen as final anomaly objects with high outlier factors. The feasibility and effectiveness of the method are verified by experiments.
Keywords: data mining; outlier detection; outlier detection method based on multi-dimensional clustering and local density (ODBMCLD) algorithm; deviation degree
A MapReduced-Based and Cell-Based Outlier Detection Algorithm
13
Authors: ZHU Sunjing, LI Jing, HUANG Jilin, LUO Simin, PENG Weiping 《Wuhan University Journal of Natural Sciences》 CAS, 2014, No. 3, pp. 199-205 (7 pages)
Outlier detection is a very important type of data mining and is extensively used in application areas. The traditional cell-based outlier detection algorithm not only takes a large amount of time to process massive data but also uses lots of machine resources, which results in an imbalanced machine load. This paper presents a MapReduce-based and cell-based outlier detection algorithm, combined with the single-layer perceptron, which parallelizes outlier detection. The experiments show that the improved algorithm effectively improves both the efficiency and the accuracy of outlier detection.
Keywords: outlier; MapReduce; data mining; cell; massive data
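Outlier Detection Algorithm Based on a Mapping Distance Ratio Outlier Factor
14
Authors: 张忠平, 姚春辰, 孙光旭, 刘硕, 张睿博, 魏永辉 《计算机集成制造系统》 EI CSCD PKU Core, 2024, No. 5, pp. 1719-1732 (14 pages)
Proximity-based outlier detection methods spend considerable time filtering normal points and have difficulty detecting local outliers while detecting global outliers. To address these problems, an outlier detection algorithm based on the mapping distance ratio outlier factor (MDROF) is proposed. First, to reduce the time spent on normal points during detection, the concept of difference similarity is introduced, and a difference-similarity pruning factor is defined to filter out most normal points in the dataset. Second, the mapping k-distance is defined; the ratio of the mapping distance to the reachability distance characterizes the local outlier degree of a data object, and the reachability density characterizes its global outlier degree. Finally, the mapping distance ratio outlier factor is defined by incorporating the average rank of a data object's mutual neighbors, and is used to detect outliers. On synthetic and real datasets, the algorithm is compared experimentally with other classical outlier detection algorithms in terms of precision, AUC, and outlier discovery curves. The results show that MDROF clearly outperforms the compared algorithms in the accuracy and stability of outlier detection.
Keywords: data mining; outlier detection; difference-similarity pruning; mapping k-distance; mapping distance ratio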
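A Machine Learning-Based Data Mining Algorithm for Outliers in Clustered Sequences
15
Authors: 王彩霞, 陶健, 舒升 《通化师范学院学报》, 2024, No. 8, pp. 28-34 (7 pages)
Because outlier data in clustered sequences is temporally dependent, outliers are difficult to detect accurately, which leads to unsatisfactory data mining results. To address this problem, a machine learning-based data mining algorithm for outliers in clustered sequences is proposed. Machine learning is used to cluster the outlier data of the clustered sequence and to compute an outlier index for each outlier; machine learning then aggregates the data and assigns the outlier data. The feature sequences of the data samples are traversed, the applicability of each feature interval is computed, and the relationship between features and the target variable is analyzed. The data classification and mining problem is converted into a linearly separable problem to avoid overfitting. The mining process is then designed, and mining is carried out according to the recorded timestamp of each data point. Experimental results show that the algorithm deviates from the actual outlier proportion by only 1% on the PSLG dataset and agrees on all the others, and that the mined range matches the labeled range, indicating accurate mining.
Keywords: machine learning; clustered sequence; outlier; data mining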
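Design of a Local Outlier Data Mining Algorithm Based on Association Rules
16
Authors: 王玲风 《佳木斯大学学报(自然科学版)》 CAS, 2024, No. 6, pp. 18-21 (4 pages)
To address the low relevance of results and the low mining efficiency of existing algorithms when mining local discrete data, association rules are introduced and the design of a local outlier data mining algorithm is studied. The local discrete data to be mined is preprocessed, including data cleaning and data integration. For high-dimensional data within the local discrete data, a method based on attribute correlation analysis is proposed to perform clustering. The outlier factor and chain distance used in the mining algorithm are determined. Finally, association rules are applied to mine the local discrete data in parallel. Comparative experiments show that the new algorithm yields results with higher relevance and mines more efficiently, giving it high application value.
Keywords: association rules; outlier; algorithm; mining; data; local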
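A Network Anomaly Data Mining and Classification Method Based on an Improved K-means Clustering Algorithm
17
Authors: 贺萌 《无线互联科技》, 2024, No. 18, pp. 119-122 (4 pages)
To address the high false-negative and false-positive rates in network anomaly data mining, this article proposes a mining and classification method for network anomaly data based on an improved K-means clustering algorithm. A parallelized frequent itemset mining environment is built to speed up data processing, and local outlier detection is used to remove anomalous values. K-means clustering is introduced to compute the max-min distances of the data, and a membership function is combined with a density peak optimization algorithm to improve the selection of initial cluster centers and the adjustment of cluster boundaries, thereby improving anomaly identification accuracy and classification efficiency. Experimental results show that the method markedly improves clustering quality and performance.
Keywords: K-means clustering algorithm; network anomaly; data mining; data classification; outlier detection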
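The abstract mentions max-min distances in connection with choosing initial K-means centers. One common reading of that idea is farthest-first selection: each new center is the point whose distance to its nearest already-chosen center is largest. The sketch below illustrates only that generic technique; the function name, the choice of the first point as the starting center, and the toy data are assumptions, and the article's membership function and density peak steps are not reproduced.

```python
import math


def maxmin_centers(points, k):
    """Pick k initial centers by the max-min (farthest-first) distance rule."""
    centers = [points[0]]  # assumption: start from the first point
    while len(centers) < k:
        # Choose the point that is farthest from its nearest chosen center.
        nxt = max(points,
                  key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers


# Hypothetical 2-D records forming three loose groups.
records = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
initial_centers = maxmin_centers(records, k=3)
```

Spreading the initial centers in this way tends to avoid starting two centers inside the same group, which is a frequent cause of poor K-means results with random initialization.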
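A Survey of Outlier Detection Algorithms
18
Authors: 孔翎超, 刘国柱 《计算机科学》 CSCD PKU Core, 2024, No. 8, pp. 20-33 (14 pages)
Outlier detection is an important research direction in data mining. Its purpose is to uncover data in a dataset that is distinctive and has potential analytical value, helping researchers identify possible problems in the data source. Outlier detection is now widely applied in fraud identification, smart healthcare, intrusion detection, fault diagnosis, and many other fields. Building on earlier work, this paper first discusses the definition of outliers, their causes, and typical application areas, reviews the strengths and limitations of classical outlier detection algorithms such as DBSCAN and LOF and their improved variants, and analyzes the advantages of deep learning methods in outlier detection. Second, in view of the need to process massive, high-dimensional, and time-series data in today's Internet setting, it further examines how outlier detection algorithms have developed in this new environment. Finally, it introduces evaluation metrics for outlier detection algorithms, the role of cost factors in evaluating outlier detection, and commonly used toolkits and datasets, and summarizes the challenges facing outlier detection and its future directions.
Keywords: outlier; anomaly detection; deep learning; time-series data; data mining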
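Research on Multi-Scale Outlier Mining in Heterogeneous Networks Based on Unsupervised Learning
19
Authors: 朱辉, 张莉芸 《现代电子技术》 PKU Core, 2024, No. 12, pp. 182-186 (5 pages)
Existing multi-scale outlier mining algorithms for heterogeneous networks ignore the order relationships among data points and cannot make full use of the ordering information of data points in the network, which lowers clustering accuracy. To address this, a multi-scale outlier mining method for heterogeneous networks based on unsupervised learning is proposed, and the multi-node, multi-edge characteristics of heterogeneous networks are analyzed. Seasonal-trend time-series decomposition is used to extract features from heterogeneous network data. Based on these features, the K-means clustering algorithm is combined with a ranking algorithm so that the ordering information of data points is added to the clustering process, enabling outliers in heterogeneous network data to be mined. Experimental results show that the accuracy of clustering network data nodes with this method consistently exceeds 80%, and that after multi-scale outlier mining the method identifies outliers accurately, providing a sound basis for subsequent network communication maintenance.
Keywords: heterogeneous network; multi-scale; outlier mining; unsupervised learning; K-means clustering; network data; outlier factor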
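An Outlier Detection Method Based on Fuzzy Neighborhood Entropy
20
Authors: 刘佳莉, 陈锦坤 《南京大学学报(自然科学版)》 CAS CSCD PKU Core, 2024, No. 3, pp. 511-522 (12 pages)
Outlier detection (also called anomaly detection) is an important research direction in data mining; its purpose is to find data points that differ markedly from the rest of the data. Outlier detection methods based on traditional rough set theory ignore the fuzziness of samples and their neighborhood relations. To address this, fuzzy neighborhood rough sets are used to remedy the shortcomings of classical rough sets and, combined with the uncertainty captured by entropy, a new outlier detection method based on fuzzy neighborhood entropy is proposed. First, a fuzzy neighborhood approximation space is constructed using a fuzzy neighborhood radius and a hybrid fuzzy similarity. Then, a specific fuzzy neighborhood combination entropy and a relative fuzzy neighborhood combination entropy are defined to build a fuzzy neighborhood outlier degree, from which a fuzzy neighborhood entropy-based outlier factor is defined for outlier detection, and a fuzzy neighborhood entropy-based outlier detection algorithm (FNEOD) is designed. Finally, FNEOD is compared with the main outlier detection algorithms. Experimental results show that the method is effective and adaptable.
Keywords: data mining; outlier detection; fuzzy neighborhood combination entropy; relative fuzzy neighborhood combination entropy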