A Density-Based Parallel Clustering Algorithm

Efficient parallel clustering algorithm based on density

Abstract: To address the high dimensionality and complexity of clustering microarray gene expression data, this paper proposes a density-based parallel clustering algorithm. On a distributed-memory system under the APRAM model, using MPI, the algorithm first computes the Euclidean distance matrix and the density function, each with parallel time complexity O(n^2/p), where p is the number of processors. With these two precomputations, the clustering process itself runs in parallel time O(nK/p); that is, the algorithm accepts the cost of one additional computation in order to lower the time complexity of the clustering process. Experiments on an eight-node cluster show that the algorithm achieves a higher parallel speedup than comparable algorithms and increases the clustering speed for high-dimensional biological data.
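The abstract gives only the complexity figures, so the following Python sketch is one hedged reading of how the three phases could divide work among p processors, not the authors' implementation. It assumes a row-block partition of the n points, a Gaussian-kernel density function, and a greedy rule that picks the K densest points lying more than `radius` apart as centers. All function names and the `radius` parameter are hypothetical, and the loop over blocks merely stands in for p MPI ranks working concurrently.

```python
import math


def row_blocks(n, p):
    """Split row indices 0..n-1 into p contiguous, near-equal blocks."""
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def local_distances(points, rows):
    """Rows of the Euclidean distance matrix owned by one processor.

    About n/p rows of n entries each, i.e. O(n^2/p) distance evaluations.
    """
    return {i: [math.dist(points[i], q) for q in points] for i in rows}


def local_density(dist, rows, radius):
    """Assumed density: sum of Gaussian kernels over one block of rows."""
    return {i: sum(math.exp(-(d / radius) ** 2) for d in dist[i]) for i in rows}


def pick_centers(dist, density, k, radius):
    """Assumed rule: densest points first, skipping any within `radius`
    of an already chosen center."""
    order = sorted(density, key=density.get, reverse=True)
    centers = []
    for i in order:
        if all(dist[i][c] > radius for c in centers):
            centers.append(i)
        if len(centers) == k:
            break
    return centers


def local_assign(dist, rows, centers):
    """Label each owned point with its nearest center by table lookup.

    K lookups per point, i.e. O(nK/p) per processor.
    """
    return {i: min(range(len(centers)), key=lambda j: dist[i][centers[j]])
            for i in rows}


def cluster(points, p, k, radius):
    """Run the three phases; each loop pass stands in for one processor."""
    blocks = row_blocks(len(points), p)
    dist, density, labels = {}, {}, {}
    for rows in blocks:
        dist.update(local_distances(points, rows))
    for rows in blocks:
        density.update(local_density(dist, rows, radius))
    centers = pick_centers(dist, density, k, radius)
    for rows in blocks:
        labels.update(local_assign(dist, rows, centers))
    return [labels[i] for i in range(len(points))]
```

In this sketch the distance matrix is computed once and then reused, so the assignment phase needs only K lookups per point. That is one plausible way the extra precomputation could buy the lower clustering cost described in the abstract.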

Authors: Mao Shaoyang, Li Kenli

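Affiliations: Department of Mathematics, Hunan Institute of Humanities, Science and Technology; School of Computer and Communication, Hunan University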

Source: Computer Engineering and Applications (CSCD; Peking University Core Journal), 2007, Issue 30, pp. 157-161 (5 pages)

Funding: National Natural Science Foundation of China (Grant No. 60603053); Key Project of the Ministry of Education (No. 105128).

Keywords: parallel computing; APRAM model; partition clustering; density function; time complexity

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]


