15 articles found
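Text extraction based on sensitive-point color clustering and text-line clustering screening (cited by 3)
1
Authors: 刘琼 周慧灿 王耀南 《计算机应用》 CSCD PKU Core 2010, No. 2, pp. 449-452 (4 pages)
Existing text extraction algorithms cannot adapt to complex background changes or to variations in the shape of the characters themselves. To address this, a method based on two-level color clustering of sensitive points and clustering-based screening of text lines is proposed. The new method exploits the fact that human vision is more sensitive to large color changes and uses the dominant colors of the sensitive points as the basis for cluster analysis, which overcomes the strong dependence of existing thresholding and clustering methods on background color changes. On this basis, text lines are screened according to their spatial arrangement features, which overcomes the tendency of general methods to be affected by changes in character shape and size. Experiments show that the new method adapts well both to complex background changes and to variations in character shape and size.
Keywords: text extraction; K-means; edge density; text-line clustering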
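Mining maximal row-constant biclusters from gene expression data (cited by 5)
2
Authors: 缪苗 尚学群 刘加财 王淼 《计算机应用研究》 CSCD PKU Core 2011, No. 12, pp. 4447-4450 (4 pages)
Biclustering is an important research direction in the analysis of gene expression data. Its goal is to discover which genes have similar expression levels, or are closely related, under which experimental conditions. Many biclustering algorithms have been proposed to mine different types of biclusters, but most of them are inefficient. In view of this, a novel mining algorithm, MRCluster, is proposed, mainly for mining maximal row-constant bicluster patterns from raw gene expression data. For efficiency, it adopts a depth-first, gene-extension mining strategy based on the Apriori principle and introduces several novel pruning techniques during mining. MRCluster is compared with RAP (range support pattern), an existing method for mining row-constant bicluster patterns. The experimental results show that MRCluster mines maximal row-constant bicluster patterns from raw gene expression data more efficiently than RAP. MRCluster can therefore effectively mine maximal row-constant biclusters from raw gene expression data.
Keywords: raw data; constant bi- (常量双); range support; gene chip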
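Singular vector space biclustering algorithm (cited by 3)
3
Authors: 徐晓华 席艳秋 潘舟金 陆林 陈崚 《微电子学与计算机》 CSCD PKU Core 2012, No. 3, pp. 78-83 (6 pages)
This paper proposes a singular vector space biclustering algorithm for the biclustering problem on 0/1 matrices. The 0/1 matrix is mapped onto the left and right singular vector spaces by singular value decomposition (SVD). Information entropy is then used to decide whether row clustering or column clustering takes priority. Finally, row clustering or column clustering is carried out recursively according to that decision until a stopping condition is met. Experiments show that the algorithm can distinguish completely non-overlapping submatrices and obtains hard biclusters relatively quickly.
Keywords: SVD; 0/1 matrix; row clustering; Boolean matrix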
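A reassembly method for equal-sized rectangular paper shreds (cited by 5)
4
Authors: 陈玉成 田娇 《厦门理工学院学报》 2014, No. 3, pp. 103-108 (6 pages)
The concept of edge similarity is introduced, and a greedy algorithm is used to reassemble Chinese and English documents that have been cut vertically into shreds. For Chinese and English shreds cut both vertically and horizontally, the text regions of the shreds are first processed with a coloring-inversion method, the shreds are then grouped by row matching degree using a row-clustering screening method, and finally each group of shreds is reassembled by the greedy algorithm supplemented by manual intervention. Reassembly results for single-sided English shreds show that the method needs few manual interventions and restores documents efficiently and well.
Keywords: edge similarity; row-clustering screening method; coloring inversion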
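The greedy step for vertically cut strips can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define edge similarity, so the negative sum of absolute grey-level differences between touching pixel columns is used here as a stand-in, and the names `edge_similarity` and `greedy_order` are hypothetical.

```python
def edge_similarity(right_edge, left_edge):
    """Assumed similarity measure: negative sum of absolute grey-level differences."""
    return -sum(abs(a - b) for a, b in zip(right_edge, left_edge))


def greedy_order(strips, start=0):
    """Order vertical strips left to right, beginning with strip index `start`.

    Each strip is a list of pixel rows; each row is a list of grey values.
    At every step the unused strip whose left edge best matches the right
    edge of the last placed strip is appended.
    """
    order = [start]
    remaining = set(range(len(strips))) - {start}
    while remaining:
        right = [row[-1] for row in strips[order[-1]]]
        best = max(
            remaining,
            key=lambda j: edge_similarity(right, [row[0] for row in strips[j]]),
        )
        order.append(best)
        remaining.remove(best)
    return order


# Toy 2-row image with a smooth horizontal gradient, cut into 3 strips of width 2.
s0 = [[0, 10], [5, 15]]
s1 = [[20, 30], [25, 35]]
s2 = [[40, 50], [45, 55]]
shuffled = [s0, s2, s1]
order = greedy_order(shuffled, start=0)
```

Choosing the leftmost strip (`start`) is left to the caller here; in practice it might be detected from a blank left margin, which is again an assumption rather than something the abstract states.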
APPLICATION OF FUZZY INFERENCE IN IDENTIFICATION OF HELICOPTER MODEL
5
Authors: 宋彦国 张呈林 徐锦法 《Transactions of Nanjing University of Aeronautics and Astronautics》 EI 2001, No. 2, pp. 124-129 (6 pages)
The helicopter mathematical model mainly serves the design of helicopter control systems, flight simulators, and real-time control simulation systems. However, it is difficult to establish a helicopter flight dynamics model that is fast, reliable, and precise, because some sophisticated helicopter phenomena have no unique and precise expression. In this paper a fuzzy helicopter flight model is constructed from flight experimental data. The fuzzy model, which is identified by fuzzy inference, is fast to compute and highly precise. To guarantee the precision of the identified fuzzy model, a new method is adopted to handle conflicting fuzzy rules. Additionally, fuzzy clustering effectively reduces the number of rules of the fuzzy model, namely the order of the fuzzy model. The simulation results indicate that the method is effective and feasible.
Keywords: helicopter mathematical model fuzzy inference fuzzy clustering flight control
Identifying Similar Operation Scenes for Busy Area Sector Dynamic Management (cited by 2)
6
Authors: HU Minghua ZHANG Xuan YUAN Ligang CHEN Haiyan GE Jiaming 《Transactions of Nanjing University of Aeronautics and Astronautics》 EI CSCD 2020, No. 4, pp. 615-629 (15 pages)
Air traffic controllers face challenging initiatives due to uncertainty in air traffic. One way to support their initiatives is to identify similar operation scenes. Based on the operation characteristics of typical busy area control airspace, a complexity measurement indicator system is established. We find that operation in area sectors is characterized by aggregation and continuity, and that reducing dimensionality and information redundancy is feasible for dynamic operation data based on principal components. Using principal components, discrete features and time series features are constructed. Based on a Gaussian kernel function, Euclidean distance and dynamic time warping (DTW) are used to measure the similarity of the features. The similarity matrices are then input to spectral clustering. The clustering results show that, based on the indicator system, similar scenes of trend are not ideal while similar scenes of modes are good. Finally, actual vertical operation decisions for the area sector are compared with the identification results and visualized by metric multidimensional scaling (MDS) plots. We find that the identification results reflect operation at peak hours well, but controllers make different decisions under similar conditions before dawn. The compliance rate of busy operation mode and division decisions at peak hours is 96.7%. The results also show the subjectivity of actual operation and the objectivity of identification. In most scenes, we observe that similar air traffic activities provide regularity for initiatives, validating the potential of this approach for initiatives and other artificial intelligence support.
Keywords: air traffic similar scenes unsupervised clustering dynamic operation time series similarity measure
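The similarity step described in this abstract (DTW distance passed through a Gaussian kernel to build a similarity matrix for spectral clustering) can be sketched as below. This is a minimal illustration, not the authors' implementation: the textbook DTW recurrence with absolute-difference cost is assumed, the kernel width `sigma` is a free parameter, and the function names are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def gaussian_similarity(distance, sigma=1.0):
    """Map a distance to a similarity in (0, 1] with a Gaussian kernel."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def similarity_matrix(series, sigma=1.0):
    """Pairwise similarity matrix, usable as the affinity input of spectral clustering."""
    return [
        [gaussian_similarity(dtw_distance(s, t), sigma) for t in series]
        for s in series
    ]


scenes = [[1, 2, 3], [1, 2, 2, 3], [3, 2, 1]]
affinity = similarity_matrix(scenes, sigma=2.0)
```

The first two toy series differ only by a repeated sample, so DTW treats them as identical, which is the property that makes it attractive for time series of unequal pace.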
A new clustering algorithm for large datasets (cited by 1)
7
Authors: 李清峰 彭文峰 《Journal of Central South University》 SCIE EI CAS 2011, No. 3, pp. 823-829 (7 pages)
The Circle algorithm was proposed for large datasets. The idea of the algorithm is to find a set of vertices that are close to each other and far from other vertices. The algorithm makes use of the connection between clustering aggregation and the problem of correlation clustering. The best deterministic approximation algorithm was provided for a variation of the correlation clustering problem, and it was shown how sampling can be used to scale the algorithms to large datasets. An extensive empirical evaluation was given of the usefulness of the problem and the solutions. The results show that this method achieves more than 50% reduction in running time without sacrificing the quality of the clustering.
Keywords: data mining Circle algorithm clustering categorical data clustering aggregation
The Wine Chain in Puglia: A Cluster Analysis
8
Authors: Contò Francesco Fiore Mariantonietta La Sala Piermichele 《Journal of Agricultural Science and Technology(B)》 2013, No. 10, pp. 696-716 (21 pages)
The food industry is evolving towards new forms of organization that are much more complex and characterized by a greater degree of coordination, whether in the form of vertical integration of explicit or implicit contracts between players of different levels of the industry. Therefore, the aim of this work is the search for mechanisms that can provide value to the production phase to better increase the competitiveness of the sector. For the first time, in fact, discussions about food chains have as reference a recognized legal entity, namely the integrated projects of food chain resulting from actions of agricultural policy at community, national and regional levels. The methodology consists of two steps: the administration of questionnaires to the three companies participating in food chain partnerships that proposed a draft integrated design of food chain in response to the notice of the Apulia region for the submission of integrated projects of the food chain; and a cluster analysis of the wine sector of the Italian regions. The results showed, thanks to Network Analysis, the importance for chain development of relationships formed by market relations and cooperation relations (formal and informal), and the need for more actions to enhance products through research and development activities.
Keywords: Apulia wine food chain rural development integrated project of food chain cluster analysis
Research on Clustering Analysis and Its Application in Customer Data Mining of Enterprise (cited by 1)
9
Authors: Wei ZHAO Xiangying LI Liping FU 《International Journal of Technology Management》 2014, No. 9, pp. 16-19 (4 pages)
The paper studies an improved K-means algorithm and establishes indicators to classify customers according to the RFM model. Experimental results show that the new algorithm has good convergence and stability, and that it clusters better than using the FKP algorithm alone. Finally, the paper studies the application of clustering to customer segmentation in a mobile communication enterprise. It discusses the basic theory, the customer segmentation methods and steps, and a customer segmentation model based on consumption behavior psychology, and the segmentation model is successfully applied to marketing decision support.
Keywords: K-means clustering optimization customer segmentation RFM model decision support
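Clustering customers on RFM indicators can be illustrated with plain Lloyd-style K-means. This is a sketch only: the abstract's improved algorithm is not specified, so the standard algorithm is swapped in, the initial centers are simply the first `k` points (an assumption made for determinism), and the customer values are invented toy data. Real RFM features would normally be rescaled first so that the monetary column does not dominate the distance.

```python
def kmeans(points, k, max_iters=100):
    """Plain K-means; returns (labels, centers). Initial centers: first k points."""
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical customers as (recency in days, frequency, monetary value).
customers = [(5, 20, 900.0), (40, 2, 50.0), (7, 18, 850.0), (45, 1, 30.0)]
labels, centers = kmeans(customers, k=2)
```

On this toy data the two recent, frequent, high-spending customers end up in one segment and the two lapsed customers in the other.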
Behavior Clustering for Anomaly Detection (cited by 1)
10
Authors: Zhu Xudong Li Hui Liu Zhijing 《China Communications》 SCIE CSCD 2010, No. 6, pp. 17-23 (7 pages)
We presented a novel framework for automatic behavior clustering and unsupervised anomaly detection in a large video set. The framework consisted of the following key components: 1) Drawing from natural language processing, we introduced a compact and effective behavior representation method as a stochastic sequence of spatiotemporal events, where we analyzed the global structural information of behaviors using their local action statistics. 2) The natural grouping of behavior patterns was discovered through a novel clustering algorithm. 3) A run-time accumulative anomaly measure was introduced to detect abnormal behavior, whereas normal behavior patterns were recognized when sufficient visual evidence had become available based on an online Likelihood Ratio Test (LRT) method. This ensured robust and reliable anomaly detection and normal behavior recognition in the shortest possible time. Experimental results demonstrated the effectiveness and robustness of our approach using noisy and sparse data sets collected from a real surveillance scenario.
Keywords: computer vision anomaly detection Hidden Markov Model Latent Dirichlet Allocation
Clustering Approaches for Overhead Reduction over Coordinated Multiple Points Network-MIMO Downlink Systems
11
Authors: Xiao Shanghui Zhang Zhongpei Shi Zhiping 《China Communications》 SCIE CSCD 2010, No. 5, pp. 103-111 (9 pages)
Owing to the potential for intercell cochannel interference mitigation and significant spectral efficiency improvement, coordinated transmission techniques using multiple radio access points have recently attracted a lot of attention. In this paper, the system structure and mathematical signal model based on a clustered structure are presented for multipoint coordinated downlink transmission, clustered supercell configurations with static and dynamic approaches are discussed, and an optimal precoding design is then provided for an accepted level of scheduling complexity and reduced signaling overhead. Simulation results are given to evaluate the performance of different cell-clustering approaches and to show that a clustered supercell size of 7 is a reasonable choice for clustered coordination with the given transmit power and the reduced feedback.
Keywords: overhead reduction clustering approaches supercell MIMO systems cooperative communication
The Methodology of Application Development for Hybrid Architectures
12
Authors: Vladimir Orekhov Alexander Bogdanov Vladimir Gaiduchok 《Computer Technology and Application》 2013, No. 10, pp. 543-547 (5 pages)
This paper provides an overview of the main recommendations and approaches of a methodology for developing parallel computation applications for hybrid structures. The methodology was developed within the master's thesis project "Optimization of complex tasks' computation on hybrid distributed computational structures" accomplished by Orekhov, during which the main research objective was to determine patterns in the behavior of scaling efficiency and other parameters that define the performance of different algorithms' implementations executed on hybrid distributed computational structures. Major outcomes and dependencies obtained within the master's thesis project were formed into a methodology that covers the problems of applications based on parallel computations and describes the process of their development in detail, offering easy ways of avoiding potentially crucial problems. The paper is backed by real-life examples such as clustering algorithms instead of artificial benchmarks.
Keywords: Hybrid architectures parallel computing simulation modeling
Research on Parallel K-Medoids algorithm based on MapReduce
13
Authors: Xianli QIN 《International Journal of Technology Management》 2015, No. 1, pp. 26-28 (3 pages)
The traditional K-Medoids clustering algorithm hits a bottleneck in memory capacity and CPU processing speed when dealing with massive data. To solve this problem, the paper proposes a parallel algorithm under the MapReduce programming model, based on a study of the K-Medoids algorithm. The algorithm increases the computation granularity and reduces the communication cost ratio on the basis of the MapReduce model. The experimental results show that, compared with other algorithms, the improved parallel algorithm greatly enhances speedup and operating efficiency.
Keywords: K-Medoids MAPREDUCE Parallel computing HADOOP
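How K-Medoids splits into map and reduce phases can be sketched in a single process as below. This is an assumption-laden illustration, not the paper's Hadoop implementation: the map phase emits (nearest medoid index, point) pairs, the reduce phase picks as new medoid the cluster member with the smallest total distance to the other members, Manhattan distance is chosen arbitrarily, and all function names are hypothetical.

```python
from collections import defaultdict


def distance(p, q):
    """Manhattan distance (an arbitrary choice for this sketch)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def map_phase(points, medoids):
    """Map: emit (index of nearest medoid, point) for every input point."""
    for p in points:
        key = min(range(len(medoids)), key=lambda i: distance(p, medoids[i]))
        yield key, p


def reduce_phase(pairs, medoids):
    """Reduce: per cluster, choose the member minimizing total intra-cluster distance."""
    groups = defaultdict(list)
    for key, p in pairs:
        groups[key].append(p)
    updated = list(medoids)
    for key, members in groups.items():
        updated[key] = min(members, key=lambda c: sum(distance(c, m) for m in members))
    return updated


def k_medoids(points, medoids, max_rounds=20):
    """Repeat map and reduce rounds until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_rounds):
        updated = reduce_phase(map_phase(points, medoids), medoids)
        if updated == medoids:
            break
        medoids = updated
    return medoids


points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
result = k_medoids(points, medoids=[(0, 0), (12, 0)])
```

In a real MapReduce job each round would be one job submission, with the current medoids broadcast to all mappers; the loop above only mimics that control flow.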
Efficient parallel implementation of a density peaks clustering algorithm on graphics processing unit (cited by 2)
14
Authors: Ke-shi GE Hua-you SU Dong-sheng LI Xi-cheng LU 《Frontiers of Information Technology & Electronic Engineering》 SCIE EI CSCD 2017, No. 7, pp. 915-927 (13 pages)
The density peak (DP) algorithm has been widely used in scientific research due to its novel and effective peak density-based clustering approach. However, the DP algorithm uses each pair of data points several times when determining cluster centers, yielding high computational complexity. In this paper, we focus on accelerating the time-consuming density peaks algorithm with a graphics processing unit (GPU). We analyze the principle of the algorithm to locate its computational bottlenecks, and evaluate its potential for parallelism. In light of our analysis, we propose an efficient parallel DP algorithm targeting a GPU architecture and implement this parallel method with compute unified device architecture (CUDA), called the 'CUDA-DP platform'. Specifically, we use shared memory to improve data locality, which reduces the amount of global memory access. To exploit the coalesced access mechanism of the GPU, we convert the data structure of the CUDA-DP program from array of structures to structure of arrays. In addition, we introduce a binary search-and-sampling method to avoid sorting a large array. The results of the experiment show that CUDA-DP can achieve a 45-fold acceleration when compared to the central processing unit based density peaks implementation.
Keywords: Density peak Graphics processing unit Parallel computing CLUSTERING
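The two per-point quantities that make the density peaks algorithm pairwise-heavy can be sketched sequentially as below. This is a CPU illustration under stated assumptions, not the CUDA-DP code: the cutoff-kernel density (number of neighbours closer than `d_c`) and the strict "higher density" rule are one common formulation, ties in density are not specially handled, and the coordinates are kept in separate lists only to echo the structure-of-arrays layout mentioned in the abstract.

```python
import math


def density_peaks(xs, ys, d_c):
    """Return (rho, delta) for 2-D points stored as two coordinate lists.

    rho[i]   = number of other points closer than d_c to point i
    delta[i] = distance to the nearest point of strictly higher rho,
               or the largest distance from i if no such point exists
    """
    n = len(xs)
    # Every pair of points is visited here; this is the quadratic bottleneck.
    dist = [
        [math.hypot(xs[i] - xs[j], ys[i] - ys[j]) for j in range(n)]
        for i in range(n)
    ]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


xs = [0.0, 1.0, 2.0, 10.0]
ys = [0.0, 0.0, 0.0, 0.0]
rho, delta = density_peaks(xs, ys, d_c=1.5)
```

Points with both large `rho` and large `delta` are candidate cluster centers (the second point in this toy data), while a large `delta` with small `rho` marks an outlier (the last point).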
Evolved clustering analysis of 300 MW boiler furnace pressure sequence based on entropy characterization
15
Authors: GU Hui REN ShaoJun SI FengQi XU ZhiGao ZHAO LingLing 《Science China(Technological Sciences)》 SCIE EI CAS CSCD 2016, No. 4, pp. 647-656 (10 pages)
The furnace process is very important in boiler operation, and furnace pressure is an important parameter of the furnace process. Therefore, there is a need to analyze and monitor the pressure signal in the furnace. However, little work has been conducted on the relationship between the pressure sequence and the boiler's load under different working conditions. Since the pressure sequence contains complex information, it demands feature extraction methods that consider multiple aspects. In this paper, a fuzzy c-means analysis method based on a weighted validity index (VFCM) is proposed for working condition classification based on feature extraction. To deal with the fluctuating and time-varying pressure sequence, feature extraction is performed as nonlinear analysis based on entropy theory. Three kinds of entropy values, extracted from the pressure sequence in the time-frequency domain, are studied as the clustering objects for working condition classification. The weighted validity index, which takes the degree of closeness and separation into consideration, is calculated on the basis of the Silhouette index and the Krzanowski-Lai index to obtain the optimal clustering number. Each time FCM runs, the weighted validity index evaluates the clustering result, and the optimal clustering number is obtained when the index reaches its maximum value. Four datasets from the UCI Machine Learning Repository are used to verify the effectiveness of VFCM. Pressure sequences obtained from a 300 MW boiler are then taken as a case study. The result of the pressure sequence case study, with an error rate of 0.5332%, shows valuable information on the boiler's load and the pressure sequence in the furnace. The relationship between the boiler's load and the entropy values extracted from the pressure sequence is proposed. Moreover, the method can be considered a reference method for data mining in other fluctuating and time-varying sequences.
Keywords: furnace pressure sequence ENTROPY validity index fuzzy c-means analysis method based on weighted validity index
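The fuzzy c-means building block that the abstract's VFCM method rests on can be sketched as one membership update followed by one center update. This is the standard FCM formulation on one-dimensional data, assumed here for illustration; the weighted validity index itself is not reproduced, and the function names and toy values are hypothetical.

```python
def fcm_memberships(data, centers, m=2.0):
    """Standard FCM membership update for 1-D data with fuzzifier m > 1.

    u[i][j] = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)),
    where d_ij is the distance from sample i to center j.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in data:
        d = [abs(x - c) for c in centers]
        if 0.0 in d:
            # Sample coincides with a center: give it full membership there.
            row = [1.0 if di == 0.0 else 0.0 for di in d]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((di / dk) ** exponent for dk in d) for di in d]
        memberships.append(row)
    return memberships


def fcm_centers(data, memberships, m=2.0):
    """Standard FCM center update: membership-weighted mean of the samples."""
    k = len(memberships[0])
    return [
        sum((u[j] ** m) * x for u, x in zip(memberships, data))
        / sum(u[j] ** m for u in memberships)
        for j in range(k)
    ]


entropy_values = [0.0, 1.0, 4.0]  # toy stand-ins for extracted entropy features
u = fcm_memberships(entropy_values, centers=[0.0, 4.0])
new_centers = fcm_centers(entropy_values, u)
```

Alternating these two updates until the centers stop moving gives one FCM run; a validity index such as the one described above would then score that run for a given number of clusters.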