Abstract
Compared with conventional hard-partition clustering, fuzzy clustering algorithms such as fuzzy c-means (FCM) are robust to scaling of the data and reflect the actual relationship between data points and cluster centers more accurately, so they have been widely applied. For time-course gene expression data, however, conventional clustering algorithms often fail to make full use of the dynamic, temporal correlation information in the data. An autoregressive (AR) model can therefore be introduced into the fuzzy clustering algorithm, so that time-course gene expression data are clustered dynamically as a set of time series. This approach not only exploits the internal autocorrelation of time-course gene expression data, but also uses the membership functions to apply a fuzzy adjustment to the AR model's prediction process, yielding better clustering results.
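The abstract does not give the update equations, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes one AR(1) model without intercept per cluster, uses the squared one-step prediction error as the FCM "distance", and refits each AR coefficient by least squares weighted with the fuzzified memberships. All function names, the model order, and the toy data are hypothetical choices made for this example.

```python
import random


def make_ar1_series(a, x0, length=8):
    """Toy data: noiseless AR(1) series x[t] = a * x[t-1] (assumption for the demo)."""
    x = [x0]
    for _ in range(length - 1):
        x.append(a * x[-1])
    return x


def fit_ar1(series_list, weights):
    """Weighted least-squares estimate of a in x[t] ~ a * x[t-1]."""
    num = den = 0.0
    for x, w in zip(series_list, weights):
        for t in range(1, len(x)):
            num += w * x[t] * x[t - 1]
            den += w * x[t - 1] ** 2
    return num / den if den > 0 else 0.0


def ar1_error(x, a):
    """Sum of squared one-step prediction errors of series x under coefficient a."""
    return sum((x[t] - a * x[t - 1]) ** 2 for t in range(1, len(x)))


def ar_fuzzy_cluster(series_list, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Alternate between AR refitting and FCM-style membership updates."""
    rng = random.Random(seed)
    n = len(series_list)
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-6 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    coeffs = [0.0] * n_clusters
    for _ in range(n_iter):
        # Step 1: refit one AR(1) coefficient per cluster, weighting series by u^m.
        coeffs = [
            fit_ar1(series_list, [u[i][k] ** m for i in range(n)])
            for k in range(n_clusters)
        ]
        # Step 2: FCM membership update with the AR prediction error as distance.
        for i, x in enumerate(series_list):
            d = [ar1_error(x, a) + 1e-12 for a in coeffs]  # floor avoids division by zero
            inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return u, coeffs


# Demo: three series with coefficient 0.9 and three with coefficient -0.5.
data = [make_ar1_series(0.9, s) for s in (1.0, 2.0, 3.0)]
data += [make_ar1_series(-0.5, s) for s in (1.0, 2.0, 3.0)]
memberships, coefficients = ar_fuzzy_cluster(data)
```

In this sketch the membership of a series in a cluster grows as that cluster's AR model predicts the series better, which is one simple way to let temporal dynamics, rather than pointwise distance, drive the fuzzy partition. Real expression data would need a higher model order, noise handling, and a convergence criterion.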
Source
《计算机工程与设计》
CSCD
Peking University Core Journals
2008, No. 1, pp. 144-147, 159 (5 pages)
Computer Engineering and Design
Funding
Natural Science Foundation of Jiangsu Province (BK2003017)
Ministry of Education Trans-Century Excellent Talents Training Program (NCET-04-0496)
Keywords
autoregressive model
fuzzy clustering
time-course gene expression data
dynamic fuzzy clustering
autocorrelation