Abstract
Nonnegative matrix factorization (NMF) is a class of machine learning algorithms that has developed rapidly in recent years. It performs dimensionality reduction and local feature extraction on high-dimensional data, has been widely applied to the analysis and processing of bioinformatics problems, and has given rise to a series of practical derived algorithms. This paper systematically analyzes the mathematical foundations of NMF and its characteristic parts-based (local) representation, and surveys the development of standard NMF and its variants as well as research progress on algorithm initialization and parameter selection. It then summarizes applications of NMF in bioinformatics in several areas: sequence feature analysis, identification of expression patterns and functional modules, and biomedical literature mining. Finally, it points out the open problems facing NMF research and its application to biological information processing, and analyzes and predicts possible directions for future development.
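The abstract does not state which formulation it treats as standard NMF. As a minimal illustration only, the sketch below assumes the widely used Lee and Seung multiplicative update rules for the Frobenius-norm objective, factoring a nonnegative matrix V (n x m) into W (n x r) and H (r x m) with V ≈ WH. The function names (`nmf`, `matmul`, `transpose`, `frobenius_error`), the random initialization, the iteration count, and the small `eps` guard are all assumptions made for this example and are not taken from the paper.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of the residual V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx)) ** 0.5


def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate V by W @ H with nonnegative factors (illustrative sketch).

    Uses multiplicative updates, which keep W and H nonnegative because
    every update multiplies by a ratio of nonnegative quantities.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Assumption: plain random positive initialization.
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Toy nonnegative data matrix (made up for illustration): 4 samples, 3 features.
V = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [1.0, 0.0, 1.0],
     [3.0, 2.0, 5.0]]
W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

In this reading, each row of H acts as a nonnegative basis pattern and each row of W gives the additive weights of those patterns for one sample, which is the sense in which the representation is parts-based. Results depend on the initialization, one of the topics the abstract says the survey covers.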
Source
Computer Engineering & Science (《计算机工程与科学》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2010, No. 8, pp. 117-123 (7 pages)
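Funding
National Natural Science Foundation of China (60673018)
National High Technology Research and Development Program of China (863 Program) (2007AA01Z106)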
Keywords
nonnegative matrix factorization
bioinformatics
local feature