Abstract
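To address the shortcomings of tag SNP selection, namely high time complexity, low reconstruction accuracy, and a lack of biological meaning, this paper proposes a tag SNP selection method based on multi-locus linkage disequilibrium. The method first preprocesses the data set using criteria such as minor allele frequency to exclude noise loci. It then designs and improves an ant colony algorithm, tailored to the characteristics of the tag SNP selection process, to obtain a candidate tag subset. Finally, to further improve reconstruction accuracy, it takes reconstruction accuracy as the objective, uses a support vector machine as the learning model, and refines the candidate tag subset with a backward elimination strategy. Experimental results show that this filter-then-refine strategy not only reduces time complexity but also offers some advantage in sample reconstruction accuracy.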
Current tag SNP selection methods still have drawbacks, such as high time complexity, a large number of tags, and unsatisfactory reconstruction accuracy. In this study, we propose a selection method based on a multi-locus linkage disequilibrium measure. First, several criteria are applied to preprocess the data set and exclude noise loci. Second, a method based on the ant colony algorithm is designed to construct the candidate tag SNP set according to the features of tags. Finally, a support vector machine is used to accurately reconstruct the samples. The main purpose of the reconstruction process is to further improve the accuracy and reduce the number of tag SNPs. The experimental results show that this filter-refine framework improves prediction accuracy and reduces time complexity.
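The preprocessing step named in the abstract, excluding noise loci by minor allele frequency, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the 0/1 haplotype encoding, the function names `minor_allele_frequency` and `filter_by_maf`, and the 0.05 threshold are all assumptions, since the abstract does not state them.

```python
from typing import List, Sequence


def minor_allele_frequency(column: Sequence[int]) -> float:
    """MAF of one biallelic SNP, given 0/1 allele codes across haplotypes."""
    freq_one = sum(column) / len(column)
    return min(freq_one, 1.0 - freq_one)


def filter_by_maf(haplotypes: List[List[int]], threshold: float = 0.05) -> List[int]:
    """Return the indices of SNP columns whose MAF is at least `threshold`.

    The 0.05 default is an assumed, commonly used cutoff, not a value
    taken from the paper.
    """
    n_snps = len(haplotypes[0])
    kept = []
    for j in range(n_snps):
        column = [row[j] for row in haplotypes]
        if minor_allele_frequency(column) >= threshold:
            kept.append(j)
    return kept


# Toy data: 4 haplotypes (rows) by 4 SNPs (columns); SNP 0 is monomorphic.
haplotypes = [
    [0, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
]
kept = filter_by_maf(haplotypes)
print(kept)
```

In this toy data the first SNP never varies, so its MAF is 0 and it is dropped; the remaining columns would then go on to the candidate-selection stage.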
Source
《湘南学院学报》
2015, Issue 2, pp. 39-43 (5 pages)
Journal of Xiangnan University
Funding
Scientific Research Project of the Hunan Provincial Department of Education (12C0673)