
TCBLSA: A New Method of Chinese Text Clustering (cited by: 15)
Abstract: Based on the theory of latent semantic analysis (LSA), this paper proposes a new method of text clustering. The method applies LSA to build the vector space model of the document collection. It introduces semantic relations into the term weights and reduces the "noise" contained in the original term matrix, so that the semantic relations between terms and documents stand out more clearly. Singular value decomposition (SVD) effectively reduces the dimensionality of the vector space, which improves both the precision and the speed of text clustering.
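Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list, 2004, No. 5, pp. 21-22, 37 (3 pages)
Funding: National Natural Science Foundation of China (Grant No. 60275020)
Keywords: text clustering; latent semantic analysis; singular value decomposition; vector space model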
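
The pipeline outlined in the abstract (term matrix, SVD, reduced vector space, clustering) can be illustrated with a minimal sketch. The record above does not give the paper's term-weighting scheme or its clustering algorithm, so everything here is an assumption made for illustration: raw term counts as weights, a toy 6-term by 4-document matrix, a small cosine-based k-means, and the helper names `lsa_document_vectors`, `normalize_rows`, and `kmeans_cosine`. Chinese word segmentation is assumed to have been done already when the term matrix was built.

```python
import numpy as np


def lsa_document_vectors(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    # term_doc has shape (n_terms, n_docs); term_doc = U @ diag(s) @ Vt.
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep only the k largest singular values; row j represents document j.
    return (np.diag(s[:k]) @ vt[:k, :]).T


def normalize_rows(x):
    """Scale each row to unit length, leaving all-zero rows unchanged."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    return x / norms


def kmeans_cosine(vectors, init_indices, n_iter=20):
    """Tiny k-means on unit vectors, seeded with the given documents.

    This is a stand-in clustering step, not the algorithm from the paper.
    """
    unit = normalize_rows(vectors)
    centers = unit[list(init_indices)].copy()
    labels = np.zeros(len(unit), dtype=int)
    for _ in range(n_iter):
        # Assign each document to the center with the highest cosine similarity.
        labels = np.argmax(unit @ centers.T, axis=1)
        for c in range(len(centers)):
            members = unit[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
        centers = normalize_rows(centers)
    return labels


# Hypothetical term-document counts: rows are terms, columns are documents.
# Documents 0 and 1 share one vocabulary, documents 2 and 3 another.
term_doc = np.array(
    [
        [2, 1, 0, 0],
        [1, 2, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 3, 1],
        [0, 0, 1, 3],
        [0, 0, 1, 1],
    ],
    dtype=float,
)

doc_vectors = lsa_document_vectors(term_doc, k=2)  # 4 documents, 2 dimensions
labels = kmeans_cosine(doc_vectors, init_indices=(0, 2))
print(labels.tolist())
```

In this toy setting each document is described by 2 latent coordinates instead of 6 term counts, which is the kind of dimensionality reduction the abstract credits for the gain in clustering speed. Whether precision also improves on real data depends on the choice of `k` and on the weighting scheme, neither of which is stated in the record above.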
