
Iterative Imputation Algorithm Based on Reduced Relational Grade for Gene Expression Data

Cited by: 3
Abstract: Gene expression data frequently contain missing values, which hinders the study of gene expression and adversely affects downstream analysis. This paper proposes a new similarity measure, the reduced relational grade, and builds on it an iterative imputation algorithm for missing gene expression data (RKNNimpute). The reduced relational grade is a modification of the grey relational grade that achieves the same effect while significantly reducing time complexity. RKNNimpute uses the reduced relational grade as its similarity measure, adds genes that have already been imputed to the candidate set of nearest neighbors, and fills in the remaining missing values iteratively, which improves both the quality and the performance of imputation. Extensive experiments on time-series, non-time-series, and mixed gene expression data sets were used to evaluate RKNNimpute. The results show that the reduced relational grade is an efficient metric and that RKNNimpute outperforms conventional imputation algorithms.
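Authors: 何云 (He Yun), 皮德常 (Pi Dechang)
Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core Journals list), 2015, No. 11, pp. 251-255, 283 (6 pages)
Funding: National Natural Science Foundation of China (U1433116); Jiangsu Province "333" High-Level Talent Project; Aeronautical Science Foundation (20145752033)
Keywords: gene expression data, reduced relational grade, imputation, iteration, missing value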
