Abstract
The Latent Semantic Indexing (LSI) algorithm is commonly used in literature retrieval. To address the problem that the results returned by the algorithm depend on the size of the threshold, an improved LSI algorithm is proposed in which the left and right singular-vector matrices obtained from the Singular Value Decomposition (SVD) are clustered with the k-means algorithm. Experimental results show that, compared with the traditional LSI method, the improved algorithm achieves better performance when the dimension for the k-means classification is supplied, which demonstrates the effectiveness of the algorithm.
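The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`lsi_factors`, `kmeans`, `retrieve`), the toy term-document matrix, the farthest-point initialisation, and the rule of returning every document in the cluster nearest to the query (in place of a similarity threshold) are all assumptions made for illustration. Only the document side of the SVD is clustered here; the term side could presumably be handled the same way by clustering the rows of the left factor.

```python
import numpy as np

def lsi_factors(term_doc, rank):
    """Truncated SVD: term_doc is approximated by U_k diag(s_k) Vt_k."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank, :]

def kmeans(points, n_clusters, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, n_clusters):
        # Squared distance of every point to its closest chosen center.
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def retrieve(term_doc, query, rank, n_clusters):
    """Return indices of the documents in the cluster closest to the query.

    Assumed retrieval rule: cluster membership replaces the similarity threshold.
    """
    U, _, _ = lsi_factors(term_doc, rank)
    doc_coords = term_doc.T @ U      # equals V_k * s_k: documents in latent space
    labels, centers = kmeans(doc_coords, n_clusters)
    q = query @ U                    # fold the query into the same space
    nearest = np.argmin(((centers - q) ** 2).sum(axis=1))
    return np.flatnonzero(labels == nearest).tolist()

# Hypothetical toy data: rows are terms, columns are documents.
example_term_doc = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)
example_query = np.array([1, 1, 0, 0, 0, 0], dtype=float)
hits = retrieve(example_term_doc, example_query, rank=2, n_clusters=2)
```

In this sketch the number of clusters passed to `kmeans` plays the role of the parameter that must be supplied in advance, whereas a conventional LSI search would instead require choosing a cosine-similarity cutoff.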
Source
Network & Computer Security (《计算机安全》)
2014, No. 11, pp. 18-23 (6 pages)
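Funding
Natural Science Foundation of Shandong Province (ZR2011FL004, ZR2011FM035)
Yantai Science and Technology Development Program (2010167)
Science and Technology Program of Shandong Higher Education Institutions (J11LG14)
Shandong Province Science and Technology Development Program (Soft Science) (2013RKB01127), among other projects
Supported by the Shandong Provincial Key Laboratory of Intelligent Information Processing in Universities (Shandong Institute of Business and Technology)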