Imbalanced Fuzzy Clustering Method Based on Data Density Perception

Abstract: Imbalanced data analysis is an important problem in the data field, and the large differences in distribution between classes pose a serious challenge for clustering methods. Focusing on the clustering of imbalanced data, this paper analyzes how imbalanced datasets affect fuzzy clustering and proposes a data-density-aware fuzzy clustering method. The method embeds the distribution density of the data into the initialization stage of fuzzy clustering: the dataset is segmented into areas of similar local density, and this initial partition is used to locate the initial cluster centers, so that the center of the minority class does not disappear. Building on this density-based initialization, a density-based update method for optimizing the fuzzy clustering is then designed. Experiments on datasets show that the method effectively addresses the disappearance of the minority class in imbalanced datasets and that its clustering performance is clearly better than that of the traditional FCM.
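
The abstract describes locating initial cluster centers from local data density before running fuzzy clustering, but it does not give the density measure or the update rule. The following is therefore only a minimal sketch under stated assumptions, not the authors' algorithm: it uses a Gaussian-kernel density with a hypothetical `bandwidth` parameter, picks centers with a density-peak heuristic (high density and far from any denser point), and then runs the standard FCM update rather than the paper's density-based update. The function names `density_init`, `memberships`, and `fcm` are illustrative only.

```python
import numpy as np


def pairwise_dist(A, B):
    """Euclidean distance between every row of A and every row of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)


def density_init(X, n_clusters, bandwidth=1.0):
    """Choose initial centers with a density-peak heuristic (an assumption)."""
    dist = pairwise_dist(X, X)
    # Gaussian-kernel local density of each point.
    rho = np.exp(-(dist ** 2) / (2.0 * bandwidth ** 2)).sum(axis=1)
    delta = np.empty(len(X))
    for i in range(len(X)):
        denser = rho > rho[i]
        # Distance to the nearest denser point; the densest point has none,
        # so it gets its largest distance to any point instead.
        delta[i] = dist[i, denser].min() if denser.any() else dist[i].max()
    # Good centers are both dense and far from anything denser.
    best = np.argsort(rho * delta)[-n_clusters:]
    return X[best]


def memberships(X, centers, m=2.0):
    """Standard FCM membership matrix (each row sums to 1)."""
    d = np.maximum(pairwise_dist(X, centers), 1e-12)  # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fcm(X, centers, m=2.0, n_iter=100, tol=1e-6):
    """Plain fuzzy c-means, started from the supplied centers."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        um = memberships(X, centers, m) ** m
        new_centers = (um.T @ X) / um.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(X, centers, m)


# Toy imbalanced data (made up for illustration): 300 majority points
# and 30 minority points in two well-separated groups.
rng = np.random.default_rng(0)
majority = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(300, 2))
minority = rng.normal(loc=(6.0, 6.0), scale=0.5, size=(30, 2))
X = np.vstack([majority, minority])

init_centers = density_init(X, n_clusters=2, bandwidth=1.0)
centers, u = fcm(X, init_centers)
print(init_centers)
print(centers)
```

Multiplying density by the distance to the nearest denser point is what keeps a small class in play here: its densest point has a modest density but lies far from every denser point, so it can still rank among the top candidates. Ranking by density alone would tend to place every initial center inside the majority class.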

Authors: WANG Jin; YOU Lei; LI Zhongwen; MIAO Fang (School of Information Science and Engineering, Chengdu University, Chengdu 610106, China; Institute of Big Data, Chengdu University, Chengdu 610106, China)

Affiliations: School of Information Science and Engineering, Chengdu University; Institute of Big Data, Chengdu University

Source: Journal of Chengdu University (Natural Science Edition), 2017, No. 4, pp. 373-376 (4 pages)

Funding: Supported by the Natural Science Foundation of the Sichuan Provincial Department of Education (17ZA0082)

Keywords: fuzzy clustering (FCM); distribution density; imbalanced dataset

Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]
