Abstract
This paper introduces the data clustering functions of the MATLAB Bioinformatics Toolbox, which are used mainly to analyze gene chip data. The data are first converted to an XLS file, then read into the MATLAB Workspace with the function xlsread and stored as two variables. Missing values are imputed to reduce error in the results. The function clustergram performs hierarchical clustering and generates a heat map and a dendrogram of the data. By changing the relevant parameters, the color scheme and the distance metric can be changed, and two-way clustering (biclustering) can also be performed.
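The workflow summarized in the abstract could look roughly like the following MATLAB sketch. It is an illustration, not the authors' script: the file name, the sheet layout (sample names in the first row, gene names in the first column), the use of knnimpute for the missing-value step, and the particular distance metrics and colormap are all assumptions. It requires the Bioinformatics Toolbox.

```matlab
% Sketch of the read -> impute -> cluster workflow (assumptions noted inline).
filename = 'expression.xls';   % hypothetical file: row 1 = sample names, column 1 = gene names

% xlsread returns the numeric cells in num and the text cells in txt,
% i.e. the "two variables" mentioned in the abstract.
[num, txt] = xlsread(filename);
geneNames   = txt(2:end, 1);   % row labels taken from the first column
sampleNames = txt(1, 2:end);   % column labels taken from the first row

% Empty cells are read as NaN. Estimate them with nearest-neighbor
% imputation (an assumed choice; the paper may use a different method).
if any(isnan(num(:)))
    num = knnimpute(num);
end

% Hierarchical clustering with a heat map and dendrograms.
cg = clustergram(num, ...
    'RowLabels',    geneNames, ...
    'ColumnLabels', sampleNames, ...
    'RowPDist',     'correlation', ...  % distance metric between genes (assumed)
    'ColumnPDist',  'euclidean', ...    % distance metric between samples (assumed)
    'Colormap',     redbluecmap, ...    % color scheme of the heat map (assumed)
    'Cluster',      'all');             % cluster along both rows and columns
```

Setting 'Cluster' to 'row' or 'column' instead of 'all' restricts clustering to one dimension, and swapping the 'RowPDist' or 'Colormap' values is how the distance metric and color scheme mentioned in the abstract would be changed.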
Source
《现代生物医学进展》
CAS
2012, Issue 17, pp. 3259-3262 (4 pages)
Progress in Modern Biomedicine
Funding
National Natural Science Foundation of China (50774102)