Application of ISOMAP for Cluster Analyses of Chinese Documents (ISOMAP在中文文本聚类分析中的应用)
Abstract: In text clustering, the high dimensionality of text feature vectors makes it very difficult to evaluate the statistical characteristics of the samples, so effective dimensionality reduction is necessary. ISOMAP is a recently introduced nonlinear dimensionality reduction method that can effectively reduce the dimensionality of the text feature space. It improves the distance measure between sample vectors by replacing the traditional Euclidean distance with the geodesic distance, and maps high-dimensional text feature data into a low-dimensional space of 2 or 3 dimensions that can be visualized. This reduces the dimensionality of the data, makes the text features visualizable, and to some extent addresses the problem of choosing the number of clusters. Finally, an experiment on example data verifies the feasibility of the method.
Source: Microcomputer Applications (《微型电脑应用》), 2009, No. 8, pp. 25-26, 29 (3 pages)
Keywords: text clustering; isometric feature mapping (ISOMAP); dimensionality reduction; data visualization
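The pipeline outlined in the abstract (neighborhood graph, geodesic distances in place of Euclidean ones, then a low-dimensional embedding) can be sketched as follows. This is a minimal illustration of the general ISOMAP procedure using NumPy only, not the authors' implementation. The function name `isomap`, its parameters `n_neighbors` and `n_components`, and the toy arc data are all assumptions made for the example. The step that turns Chinese documents into feature vectors is not shown, since the abstract does not describe it.

```python
import numpy as np


def isomap(X, n_neighbors=5, n_components=2):
    """Illustrative ISOMAP: kNN graph -> geodesic distances -> classical MDS."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=-1))

    # Neighborhood graph: keep only edges to each sample's k nearest neighbors.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(euclid, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the sample itself
    for i in range(n):
        graph[i, nearest[i]] = euclid[i, nearest[i]]
    graph = np.minimum(graph, graph.T)  # make the graph symmetric

    # Geodesic distances, approximated by shortest paths (Floyd-Warshall).
    geo = graph.copy()
    for k in range(n):
        geo = np.minimum(geo, geo[:, k:k + 1] + geo[k:k + 1, :])
    if np.isinf(geo).any():
        raise ValueError("neighborhood graph is disconnected; increase n_neighbors")

    # Classical MDS on the geodesic distance matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (geo ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


# Toy data (hypothetical): 30 points on a half circle embedded in 3-D.
t = np.linspace(0.0, np.pi, 30)
arc = np.column_stack([np.cos(t), np.sin(t), np.zeros_like(t)])
embedding = isomap(arc, n_neighbors=2, n_components=2)
print(embedding.shape)
```

On the toy arc, the two end points are 2 apart in Euclidean distance but roughly π apart along the curve; the embedding preserves the latter, which is the effect of replacing Euclidean distance with geodesic distance. For real text data, each row of `X` would be one document's feature vector, and the 2-D output could be plotted to inspect cluster structure visually.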
