Abstract
To address the low efficiency and poor classification accuracy of current clustering algorithms on DNA sequence data, this paper proposes a DNA sequence classification method based on an ant colony optimization clustering algorithm (ACOC). Relative to the baseline ant-based clustering algorithm (ACA), ACOC makes two main modifications: an adaptive perception term is added to the density function, and a cooling scheme for α-adaptation is borrowed from simulated annealing. Features are extracted from each DNA sequence according to its distribution characteristics, specifically its dinucleotide frequencies, and the Pearson correlation coefficient is introduced into the ant clustering algorithm as the similarity measure. Performance tests on four datasets from the EMBL-DNA database, with comparisons against statistical clustering and k-means, indicate that the method has some advantage in both running time and accuracy and is suitable for classifying large-scale DNA sequence data.
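The feature extraction and similarity steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact feature definition or how the similarity enters the ACOC density function, so everything here is an assumption for illustration: the function names `dinucleotide_frequencies` and `pearson` are hypothetical, dinucleotides are counted over overlapping windows, and pairs containing characters other than A, C, G, T are skipped.

```python
from itertools import product
from math import sqrt

# All 16 dinucleotides in a fixed order (AA, AC, ..., TT).
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_frequencies(seq):
    """Return a 16-element vector of relative dinucleotide frequencies.

    Assumption: overlapping windows of length 2; pairs with non-ACGT
    characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors.

    Returns 0.0 when either vector has zero variance (a convention
    chosen for this sketch, not taken from the paper).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Toy sequences: two AT-rich repeats and one GC-rich repeat.
feat_a = dinucleotide_frequencies("ATATATATAT")
feat_b = dinucleotide_frequencies("ATATATAT")
feat_c = dinucleotide_frequencies("GCGCGCGC")

sim_ab = pearson(feat_a, feat_b)  # similar composition
sim_ac = pearson(feat_a, feat_c)  # dissimilar composition
```

In an ant-based clustering setting, a similarity such as `sim_ab` would stand in for the distance between two items when an ant evaluates the local density around a sequence. How ACOC combines it with the adaptive perception term is not specified in the abstract.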
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2010, Issue 25, pp. 124-126, 130 (4 pages)
Funding
National Natural Science Foundation of China, No. 60572153
Keywords
DNA sequence analysis
ant-based clustering algorithm
classification
feature extraction
Pearson correlation coefficient