Abstract
Accurately classifying gene expression data aids cancer diagnosis, but the high dimensionality and small sample sizes of such data make classification difficult. To address this problem, this paper incorporates distance information into the least squares regression subspace segmentation algorithm and proposes least squares regression subspace segmentation with distance information. In addition to the correlation among data points, the proposed model also takes the distances between them into account. Experimental results on gene expression data sets show that the proposed algorithm is an effective clustering method.
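The abstract does not state the model's objective function, so the following is only a minimal sketch of how distance information might be folded into least squares regression (LSR) subspace segmentation. It assumes the standard LSR objective min ||X - XZ||_F^2 + lam * ||Z||_F^2 and adds a hypothetical penalty gamma * sum_ij d_ij * z_ij^2, where d_ij is the squared Euclidean distance between samples i and j. The penalty form, the function names, and the parameters `lam` and `gamma` are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def lsr_coefficients(X, lam=0.1, gamma=0.0):
    """Representation matrix Z for data X (n_features x n_samples).

    Column j of Z minimises
        ||x_j - X z_j||^2 + lam * ||z_j||^2 + gamma * sum_i d_ij * z_ij^2,
    where d_ij is the squared Euclidean distance between samples i and j.
    The distance term is an assumed form; gamma = 0 gives plain LSR.
    """
    n = X.shape[1]
    G = X.T @ X                                  # Gram matrix (correlations)
    sq = np.sum(X ** 2, axis=0)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * G, 0.0)  # squared distances
    Z = np.empty((n, n))
    for j in range(n):
        # Normal equations for column j of the penalised least squares problem
        A = G + lam * np.eye(n) + gamma * np.diag(D[:, j])
        Z[:, j] = np.linalg.solve(A, G[:, j])
    return Z

def affinity(Z):
    """Symmetric non-negative affinity built from the coefficients."""
    absZ = np.abs(Z)
    return 0.5 * (absZ + absZ.T)
```

The resulting affinity matrix could then be handed to a spectral clustering routine (for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`) to obtain the sample clusters. Under this assumed penalty, distant samples receive smaller coefficients, which is one way to combine correlation and distance.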
Source
Microcomputer & Its Applications (《微型机与应用》)
2016, No. 6, pp. 63-65, 68 (4 pages)
Keywords
gene expression data
clustering
distance
subspace segmentation