Abstract
To address the classification-quality and performance problems of existing web text categorization and representation methods, this paper proposes S-LLSA, an algorithm that generates local regions from the relevance between documents and categories. It builds on the theory of local latent semantic analysis and exploits the classification strength of the support vector machine (SVM). The algorithm introduces category information into the singular value decomposition (SVD) used in semantic analysis, analyses the local features of feature words, and uses an SVM classifier to compute each text's relevance to a category; this relevance parameter is then applied in local region generation. Experiments show that S-LLSA effectively addresses the problem of how to perform local SVD on a local region, greatly improves web text classification, and better represents the latent semantic space of web text.
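The steps named in the abstract can be written out as a short sketch. The abstract gives no formulas, so everything here is an assumption for illustration: the term-document matrix $A$, the SVM decision value $f_c$, the relevance threshold $\tau$, the rank $k$, and the thresholding rule itself follow standard LSA notation and are not the authors' definitions.

```latex
% Hedged sketch only; A, f_c, \tau and k are illustrative assumptions.
% A \in \mathbb{R}^{m \times n}: term-document matrix with columns a_j
%   (m feature words, n documents).
% f_c(d_j): SVM decision value of document d_j for category c.
\begin{align*}
R_c &= \{\, j : f_c(d_j) \ge \tau \,\}
    && \text{local region: documents relevant to category } c \\
A_c &= [\, a_j \,]_{j \in R_c}
    && \text{columns of } A \text{ restricted to the local region} \\
A_c &\approx U_k \Sigma_k V_k^{\top}
    && \text{rank-} k \text{ truncated SVD of the local matrix} \\
\hat{d} &= \Sigma_k^{-1} U_k^{\top} d
    && \text{fold a document vector } d \text{ into the local semantic space}
\end{align*}
```

Under these assumptions, the category information enters the decomposition only through the choice of $R_c$: the SVD is computed per category on the documents the classifier scores as relevant, instead of once on the whole collection.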
Source
《控制工程》
CSCD
PKU Core (Peking University Core Journals)
2017, Issue 8, pp. 1701-1706 (6 pages)
Control Engineering of China