7 articles found
A subspace clustering algorithm for categorical data based on multi-attribute weights (Cited by: 19)
1
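Authors: Pang Ning (庞宁), Zhang Jifu (张继福), Qin Xiao (秦啸). 《自动化学报》 (Acta Automatica Sinica), EI, CSCD, PKU Core, 2018, No. 3, pp. 517-532 (16 pages)
Using multi-attribute frequency weights and a multi-objective cluster-set quality criterion, this paper proposes a subspace clustering algorithm for categorical data. The algorithm uses equivalence classes from rough set theory to define a multi-attribute weight calculation method, which effectively improves the attributes' power to discriminate between clusters. Building on a multi-objective cluster-set quality function, it adopts a hierarchical agglomerative strategy that iteratively merges sub-clusters and effectively measures clusters at various scales. It uses interval dispersion to avoid the parameter problem introduced by threshold-based removal of noise points, and it uses the degree to which attributes adhere to a cluster to determine each cluster's attribute-relevant subspace, which makes the clusters easier to interpret. Finally, experiments on synthetic, UCI, and stellar spectral data sets verify the feasibility and effectiveness of the algorithm.
Keywords: categorical data clustering; multi-attribute frequency; multi-objective cluster-set quality; attribute-relevant subspace; interval dispersion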
Research on a subspace clustering algorithm based on attribute grouping
2
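Authors: Pang Ning (庞宁), Jin Lizhong (靳黎忠). 《西南民族大学学报(自然科学版)》 (Journal of Southwest Minzu University, Natural Science Edition), CAS, 2023, No. 6, pp. 653-660 (8 pages)
For categorical data, a subspace clustering algorithm is proposed based on attribute grouping and a multi-objective clustering quality function. The algorithm uses attribute grouping to place highly correlated attributes in the same attribute group, derives attribute weights from the correlation among attributes within a group, and builds soft attribute subspaces. It uses a multi-objective clustering quality function to judge the overall clustering result and iteratively optimizes the cluster structure to reach the best partition of the data. Experiments on synthetic and UCI data sets verify the correctness, efficiency, and reliability of the algorithm.
Keywords: attribute grouping; multi-objective quality function; attribute subspace; categorical data clustering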
Knowledge acquisition and discovery based on data mining (Cited by: 10)
3
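Authors: Qin Guofeng (秦国锋), Li Qiyan (李启炎). 《计算机工程》 (Computer Engineering), CAS, CSCD, PKU Core, 2003, No. 21, pp. 20-22 (3 pages)
Using data mining techniques, a model is proposed for fusing data from local patterns into a global pattern, and data mining over local patterns is discussed. Two classification and clustering models and algorithms with different starting points are proposed, one based on the physical dimensions of the facts and one based on the data information of the facts, and the two are compared. Both solve problems well in practical applications and can support decision making.
Keywords: fuzzy theory; data fusion; data classification; knowledge discovery; fuzzy decision making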
Gaussian mixture models for clustering and classifying traffic flow in real-time for traffic operation and management (Cited by: 1)
4
Authors: Sun Lu (孙璐), Zhang Huimin (张惠民), Gao Rong (高荣), Gu Wenjun (顾文钧), Xu Bing (徐冰), Chen Liliang (陈鲤梁). Journal of Southeast University (English Edition), EI, CAS, 2011, No. 2, pp. 174-179 (6 pages)
Based on Gaussian mixture models (GMM), speed, flow and occupancy are used together in the cluster analysis of traffic flow data. Compared with other clustering and sorting techniques, the GMM, as a structural model, is suitable for various kinds of traffic flow parameters. Gap statistics and domain knowledge of traffic flow are used to determine a proper number of clusters. The expectation-maximization (E-M) algorithm is used to estimate the parameters of the GMM. The clustered traffic flow patterns are then analyzed statistically and used to design maximum likelihood classifiers for grouping real-time traffic flow data when new observations become available. Clustering analysis and pattern recognition can also be used to cluster and classify dynamic traffic flow patterns for freeway on-ramp and off-ramp weaving sections, as well as for other facilities involving the concept of level of service, such as airports, parking lots, intersections and interrupted-flow pedestrian facilities.
Keywords: traffic flow patterns; Gaussian mixture model; level of service; data mining; cluster analysis; classifier
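As a rough illustration of the classification step described in the abstract above, the sketch below assigns a new (speed, flow, occupancy) observation to the mixture component with the highest weighted Gaussian density. It assumes diagonal covariances for brevity, and the component parameters and the names `Component`, `log_density` and `classify` are invented for this example; none of them come from the paper.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Component:
    weight: float          # mixing proportion of the component
    mean: Sequence[float]  # per-feature mean
    var: Sequence[float]   # per-feature variance (diagonal covariance is an assumption)


def log_density(x, comp):
    """Log of weight * N(x | mean, diag(var))."""
    total = math.log(comp.weight)
    for xi, mu, v in zip(x, comp.mean, comp.var):
        total += -0.5 * math.log(2 * math.pi * v) - (xi - mu) ** 2 / (2 * v)
    return total


def classify(x, components):
    """Index of the component with the highest weighted density at x."""
    scores = [log_density(x, c) for c in components]
    return max(range(len(components)), key=scores.__getitem__)


# Hypothetical fitted parameters: (speed km/h, flow veh/h/lane, occupancy %).
COMPONENTS = [
    Component(0.5, (100.0, 800.0, 8.0), (64.0, 40000.0, 9.0)),      # free flow
    Component(0.3, (70.0, 1800.0, 20.0), (100.0, 62500.0, 25.0)),   # dense
    Component(0.2, (25.0, 1000.0, 45.0), (81.0, 90000.0, 100.0)),   # congested
]

print(classify((95.0, 900.0, 10.0), COMPONENTS))  # index of the free-flow component
print(classify((20.0, 900.0, 50.0), COMPONENTS))  # index of the congested component
```

In practice the weights, means and covariances would come from an E-M fit rather than being typed in by hand, and full covariance matrices would replace the diagonal shortcut used here.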
AN IMPROVED ALGORITHM FOR SUPERVISED FUZZY C-MEANS CLUSTERING OF REMOTELY SENSED DATA (Cited by: 1)
5
Authors: ZHANG Jingxiong, Roger P Kirby. Geo-Spatial Information Science, 2000, No. 1, pp. 39-44 (6 pages)
This paper describes an improved algorithm for fuzzy c-means clustering of remotely sensed data, by which the degree of fuzziness of the resultant classification is decreased compared with that of a conventional algorithm; that is, the classification accuracy is increased. This is achieved by incorporating covariance matrices at the level of individual classes rather than assuming a global one. Empirical results from a fuzzy classification of Edinburgh suburban land cover confirmed the improved performance of the new algorithm for fuzzy c-means clustering, in particular when fuzziness is also accommodated in the assumed reference data.
Keywords: remotely sensed data (images); classification; fuzzy c-means clustering; fuzzy membership values (FMVs); Mahalanobis distances; covariance matrix
Watershed classification by remote sensing indices: A fuzzy c-means clustering approach (Cited by: 10)
6
Authors: Bahram CHOUBIN, Karim SOLAIMANI, Mahmoud HABIBNEJAD ROSHAN, Arash MALEKIAN. Journal of Mountain Science, SCIE, CSCD, 2017, No. 10, pp. 2053-2063 (11 pages)
Identifying watersheds with relatively similar hydrological properties is crucial for classifying them for management practices such as flood and soil erosion control. This study aimed to identify hydrologically homogeneous watersheds in western Iran using remote sensing data. To achieve this goal, remote sensing indices including SAVI, LAI, NDMI, NDVI and snow cover were extracted from MODIS data over the period 2000 to 2015. A fuzzy method was then used to cluster the watersheds based on the extracted indices. A fuzzy c-means (FCM) algorithm classified 38 watersheds into three homogeneous groups. The optimal number of clusters was determined through evaluation of the partition coefficient and the partition entropy function, and by trial and error. The results indicated that the three homogeneous regions identified by fuzzy c-means clustering of the remote sensing products are consistent with the variations in topography and climate of the study area. The grouped watersheds have similar hydrological properties and are likely to need similar management considerations and measures.
Keywords: Karkheh watershed; fuzzy c-means clustering; watershed classification; homogeneous sub-watersheds
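The abstract above relies on fuzzy c-means together with two cluster-validity indices. The sketch below is a minimal, standard-library version of textbook FCM plus the usual definitions of the partition coefficient (mean of squared memberships) and partition entropy (negative mean of u·ln u). The data points, the initial centers and all function names are assumptions made for illustration; they are not the study's data or code.

```python
import math


def fcm(points, init_centers, m=2.0, iters=200, tol=1e-9):
    """Textbook fuzzy c-means; returns (centers, membership rows)."""
    centers = [tuple(v) for v in init_centers]
    c = len(centers)
    power = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        u = []
        for p in points:
            d = [max(math.dist(p, v), 1e-12) for v in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** power for j in range(c))
                      for i in range(c)])
        # Center update: mean of the points weighted by u^m
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wk * p[k] for wk, p in zip(w, points)) / total
                for k in range(len(points[0]))))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def partition_coefficient(u):
    """Ranges from 1/c (fully fuzzy) to 1 (crisp); higher is better."""
    return sum(x * x for row in u for x in row) / len(u)


def partition_entropy(u):
    """Ranges from 0 (crisp) to ln(c) (fully fuzzy); lower is better."""
    return -sum(x * math.log(x) for row in u for x in row if x > 0) / len(u)


# Hypothetical two-index feature vectors for nine watersheds.
points = [(0.20, 0.10), (0.22, 0.12), (0.18, 0.11),
          (0.50, 0.50), (0.52, 0.48), (0.49, 0.53),
          (0.80, 0.90), (0.82, 0.88), (0.79, 0.91)]
centers, u = fcm(points, init_centers=[points[0], points[3], points[6]])
labels = [max(range(3), key=row.__getitem__) for row in u]
print(labels, partition_coefficient(u), partition_entropy(u))
```

To compare candidate cluster counts, the same two indices can be computed for each value of c and read side by side, with the final choice still left to judgment, as the abstract's mention of trial and error suggests.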
A CLUSTERING ALGORITHM FOR MIXED NUMERIC AND CATEGORICAL DATA
7
Authors: Ohn Mar San, Van-Nam Huynh, Yoshiteru Nakamori. Journal of Systems Science & Complexity, SCIE, EI, CSCD, 2003, No. 4, pp. 562-571 (10 pages)
Most earlier work on clustering focused on numeric data, whose inherent geometric properties can be exploited to naturally define distance functions between data points. However, data mining applications frequently involve datasets that consist of mixed numeric and categorical attributes. In this paper we present a clustering algorithm based on the k-means algorithm. The algorithm clusters objects with numeric and categorical attributes in a way similar to k-means. The object similarity measure is derived from both numeric and categorical attributes. When applied to numeric data, the algorithm is identical to k-means. The main result of this paper is a method for updating the 'cluster centers' of objects described by mixed numeric and categorical attributes during the clustering process so as to minimize the clustering cost function. The clustering performance of the algorithm is demonstrated on two well-known data sets, namely the credit approval and abalone databases.
Keywords: cluster analysis; numeric data; categorical data; k-means algorithm
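The abstract above does not spell out its similarity measure or center update, so the sketch below uses a common stand-in instead: a k-prototypes-style cost that adds squared Euclidean distance on the numeric attributes to a weighted count of categorical mismatches, with centers updated by mean and mode. This is an assumption for illustration and may differ from the paper's definitions; `gamma`, `mixed_distance` and `update_center` are hypothetical names.

```python
from collections import Counter


def mixed_distance(obj, center, gamma=1.0):
    """Squared Euclidean on numeric parts + gamma * categorical mismatches.

    obj and center are (numeric_tuple, categorical_tuple) pairs.
    gamma is a hypothetical weight balancing the two parts.
    """
    num_o, cat_o = obj
    num_c, cat_c = center
    numeric = sum((a - b) ** 2 for a, b in zip(num_o, num_c))
    mismatches = sum(a != b for a, b in zip(cat_o, cat_c))
    return numeric + gamma * mismatches


def update_center(members):
    """Mean of each numeric attribute, most frequent value of each categorical one."""
    n = len(members)
    num_dims = len(members[0][0])
    cat_dims = len(members[0][1])
    means = tuple(sum(m[0][k] for m in members) / n for k in range(num_dims))
    modes = tuple(Counter(m[1][k] for m in members).most_common(1)[0][0]
                  for k in range(cat_dims))
    return means, modes


# Made-up objects: two numeric and two categorical attributes each.
members = [((1.0, 2.0), ("a", "x")),
           ((3.0, 4.0), ("a", "y")),
           ((2.0, 3.0), ("b", "y"))]
center = update_center(members)
print(center)
print(mixed_distance(members[0], center))
```

With no categorical attributes the mismatch term vanishes and the cost reduces to the squared Euclidean distance of ordinary k-means, which matches the abstract's remark about purely numeric data.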